Tag: frameworks


Introducing Lokai

May 24th, 2011 — 6:38pm

Lokai has finally hit the streets. This is what I call a business process management package. That is, it helps manage business processes. To me, that means two things: activities with some traceability of actions taken, and documentation. There’s a tad more to it than that, of course. It must be possible to relate things to each other; it must be possible to limit people’s access (or, conversely, allow people to see the things they need to see); and it must be extendable.

The first two go hand in hand in Lokai. All data is structured around multiple hierarchies. Each node in the hierarchy can be documentation, activity, or anything else. That gives us the ability to create different structures for different tasks, so things are related to each other in a way that makes sense in the context. At the same time, that gives us the access control mechanism. A user allocated to a point in the hierarchy has access to all points below that.
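The access rule can be pictured with a small sketch. This is not Lokai's code or API: the `Node` class, `allowed_users` and `has_access` are names invented for illustration, and the sketch gives each node a single parent, whereas Lokai allows multiple hierarchies.

```python
class Node:
    """One point in a hierarchy: documentation, an activity, or anything else."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.allowed_users = set()  # users allocated directly to this node

    def self_and_ancestors(self):
        """Yield this node, then each node above it up to the root."""
        node = self
        while node is not None:
            yield node
            node = node.parent


def has_access(user, node):
    """A user reaches a node if allocated to it or to any node above it."""
    return any(user in n.allowed_users for n in node.self_and_ancestors())


root = Node("root")
projects = Node("projects", parent=root)
ticket = Node("ticket-1", parent=projects)
docs = Node("docs", parent=root)

# Allocating alice to "projects" covers everything below that point.
projects.allowed_users.add("alice")

print(has_access("alice", ticket))  # True: the ticket sits below "projects"
print(has_access("alice", docs))    # False: "docs" is on a different branch
```

Allocating a user at one point is enough to cover a whole subtree, which is why the structure and the access control come as a pair.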

Extendability is a key factor in making this package useful, and the component-based approach taken in Lokai has already produced a special-purpose web shop. More will come.

Producing an open-source package is a challenge. I’m still trying to work out how much, and what level of, documentation to produce. I guess I need more users to ask questions before I can get a good grasp of that. More immediate, though, are the issues raised by using Lokai as the basis for its own web site. I want to give registered users access to tickets, for example, but, on the one hand, I don’t want to have to explicitly give permission to each user for each ticket, and on the other, I don’t want to give write access to the node that contains the tickets. As it happens, this issue of grouping people is clearly appropriate for any organisation or project with more than a handful of users, so Lokai springs to life with at least one ticket to be resolved, and it will be the better for it.

I’m looking forward to seeing how others will want to use it.

Deconstructing URLs

September 19th, 2010 — 8:41pm

As a follow-on to my investigation (if that’s the right word) into URL dispatch in Python frameworks I thought I would look at how an application discovers, calculates or otherwise works out what URL to use to refer to its own objects. The application wants to provide a link to an object edit page, say, so it must somehow know how to formulate such a link and where to find the contextual information that places the application in the particular environment it is running in. Let’s start by deconstructing an example.

I have an application that currently uses a URL something like http://example.com/a/projects/{identifier}/edit. This breaks down into the scheme (http:), the network location (example.com) and a path (/a/projects/{identifier}/edit). The path is an absolute path according to RFC 1808, because it starts with “/”, and it is this path that we want to recreate somehow.
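The same breakdown can be checked with the standard library. This is a small illustration using Python 3's `urllib.parse`; the identifier `42` is a made-up stand-in for {identifier}.

```python
from urllib.parse import urlsplit

# 42 is an invented value standing in for {identifier}.
url = "http://example.com/a/projects/42/edit"
parts = urlsplit(url)

print(parts.scheme)                # http
print(parts.netloc)                # example.com
print(parts.path)                  # /a/projects/42/edit
print(parts.path.startswith("/"))  # True: an absolute path
```

Note that `urlsplit` reports the scheme without the trailing colon.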

As it happens, this path has three different elements:

  • /a/ – I used this to manage the path that cookies belong to. In principle, any path that did not begin with /a/ could be used for static files, whereas paths that did start with /a/ would be processed by an application using the cookie to manage the session. In practice, it defines a name-space that allows us to put more than one instance of an application environment onto a single net location.
  • projects/ – This points to a particular resource handler used for project management. There could be other resource handlers running in the same environment. In effect, this part of the path could be thought of as narrowing down the selection of objects referenced by the remainder of the path. This creates another name-space that distinguishes available resources. We could, potentially, have a URL that looks like http://example.com/a/other_world/{identifier}/edit where the {identifier} in this case comes from a different set of identifiers than the projects set, and the edit element implies quite different functionality.
  • {identifier}/edit – The final element of the URL that is actually interpreted by some resource handler code to support an identified object.

The important point here is that the first two parts of this URL (/a/projects/) are irrelevant to the resource handler. This is, in effect, the SCRIPT_NAME of the CGI definition, and {identifier}/edit is the PATH_INFO. Clearly the SCRIPT_NAME can be changed to reflect the context, and it can be as long or as short as required. So long as it links to the correct code to interpret PATH_INFO the URL works.
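As a rough sketch of that split: CGI names the script part SCRIPT_NAME and, by convention, PATH_INFO carries the leading slash. The helper names `split_path` and `link_for` are invented for this example, and `42` again stands in for {identifier}.

```python
def split_path(full_path, script_name):
    """Return the PATH_INFO left over once script_name has been consumed."""
    if full_path != script_name and not full_path.startswith(script_name + "/"):
        raise ValueError("%r is not served under %r" % (full_path, script_name))
    return full_path[len(script_name):]


def link_for(environ, identifier, action):
    """Build a link from whatever prefix the environment supplies."""
    return "{}/{}/{}".format(environ.get("SCRIPT_NAME", ""), identifier, action)


print(split_path("/a/projects/42/edit", "/a/projects"))  # /42/edit

# The same handler code yields correct links under either prefix.
short_env = {"SCRIPT_NAME": "/a/projects", "PATH_INFO": "/42/edit"}
long_env = {"SCRIPT_NAME": "/some/longer/prefix", "PATH_INFO": "/42/edit"}
print(link_for(short_env, 42, "edit"))  # /a/projects/42/edit
print(link_for(long_env, 42, "edit"))   # /some/longer/prefix/42/edit
```

The handler never needs to know what the prefix means, only that it must be put back on the front of any link it generates.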

I am, of course, making an assumption here. The URL I have deconstructed is rather old fashioned in the sense that the structure seems to represent the application in some way. I could, in principle, write /a/{identifier}/projects/edit. The /a/ still has to come first, because in my example it is being interpreted by the http client for returning cookies, but projects/ can be anywhere I like. This doesn’t make much difference, except to emphasise two things: there is going to be some part (/a/) that is dependent on the server environment, and some part (projects/) that is going to be dependent on some sort of framework environment. The underlying problem remains the same – how to feed these two parts into the URL generation process without making the resource handler aware of the details.

I need to do two different things. I want to serve more than one resource type from the same environment, and I want to run more than one environment from the same net location. The second I could solve by virtual servers. I (simply?) configure the http server to direct www.example.com to one place and software.example.com to another. That’s fine if I have full control over the server and is probably the ‘best’ solution. The first could also be solved in the http server if the URL is strictly hierarchical, but I can’t sidestep it by limiting the site to only one resource type. Generally I am at least likely to want to refer to user objects (for access control, capturing addresses, credit cards, whatever) and, say, product objects (for the users to buy). At the very least that means choosing the object names very carefully for any particular site. On different sites, ‘users’, ‘customers’, ‘clients’, ‘patients’, or the same in the singular, may be valid options, but I have to choose just one and stick with it. (A quick read of this style guide is worth it for the reminder.)

In general, there is a three-step sequence of places that might do URL dispatch: http server, application framework, and resource handler. The http server communicates with the applications it serves using CGI. For our purposes here, the SCRIPT_NAME tells us the first part of the path we eventually want to create, so that is what we must use in the next step.

The application framework could be null if all the routing is done in the http server, but we might need to provide something if we don’t have access to the server, or if we want to be reasonably dynamic. This framework will have a URL dispatcher, and this dispatcher may support named routes. The resource handler could delegate all the routing to the framework. This works fine, because the framework extracts the useful parts of the path and presents them to the resource handler as parameters. The resource handler, however, has to ask the framework to do URL generation and this locks the resource handler firmly to the framework.

I rather like the idea of a resource handler that consists of a dispatcher and code combination that is dedicated to handling a single resource type. I can plug this in to a framework, or serve directly from an http server. Of course, if the handler provides a user interface to a browser, then there may be some conventions to follow, or code to share, but that would be a necessary consideration whatever was done. The framework becomes little more than a dispatcher that looks like an http server. It creates an appropriate SCRIPT_NAME to hand down to the resource handler, and the resource handler can handle the remaining parts of the path.
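One way to picture this idea is with WSGI rather than raw CGI, which is an assumption on my part. The sketch uses the standard library's `wsgiref.util.shift_path_info` to move one path element from PATH_INFO to SCRIPT_NAME. The names `framework`, `projects_handler`, `HANDLERS` and `call` are invented for illustration.

```python
from wsgiref.util import shift_path_info

PLAIN = [("Content-Type", "text/plain; charset=utf-8")]


def projects_handler(environ, start_response):
    """Resource handler: reads PATH_INFO only, never inspects the prefix."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    if len(parts) != 2:
        start_response("404 Not Found", PLAIN)
        return [b"not found"]
    identifier, action = parts
    # Links are rebuilt from whatever SCRIPT_NAME was handed down.
    link = "{}/{}/edit".format(environ.get("SCRIPT_NAME", ""), identifier)
    start_response("200 OK", PLAIN)
    return ["{} {} -> {}".format(action, identifier, link).encode("utf-8")]


HANDLERS = {"projects": projects_handler}


def framework(environ, start_response):
    """Little more than a dispatcher: shift one element, then delegate."""
    name = shift_path_info(environ)  # moves it from PATH_INFO to SCRIPT_NAME
    handler = HANDLERS.get(name)
    if handler is None:
        start_response("404 Not Found", PLAIN)
        return [b"not found"]
    return handler(environ, start_response)


def call(app, script_name, path_info):
    """Tiny test harness standing in for an http server."""
    environ = {"REQUEST_METHOD": "GET",
               "SCRIPT_NAME": script_name, "PATH_INFO": path_info}
    seen = []
    body = b"".join(app(environ, lambda status, headers: seen.append(status)))
    return seen[0], body.decode("utf-8")


print(call(framework, "/a", "/projects/42/edit"))
print(call(projects_handler, "/software", "/42/edit"))
```

The second call serves the handler directly, with no framework in between, and the generated link follows the different prefix without any change to the handler.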

I think I’ll work more on this idea.

In celebration of filepath

June 26th, 2010 — 4:44pm

I was reminded the other day of some of the problems that we (the software industry, and the human race in general) have in thinking of software as some sort of engineering. The term Software Engineering has a nice ring to it, but the reality is disappointing. As Jeff Atwood so nicely pointed out, the advantage of, er, real world engineering is that it has immutable laws of physics. In software the programmer is free to invent his or her own laws as the project progresses. Of course, it is possible to relate physical laws to the software world, but not, I suspect, in the way that I, or Jeff Atwood, have in mind.

The particular issue I’m looking at here is software re-use. This is where project A and project B can both use some component, even though the two projects are very different. The mutable laws of physics thing comes in when the component in question has an API that is tuned to project A and therefore doesn’t cut it for project B. This is a known problem, with a set of causes, discussed by Douglas C. Schmidt in 1999, and one key aspect of software engineering is the disciplined approach that tries to address these causes. Interestingly, Schmidt picks up on two problems that are relevant here. One is that re-usable software has to be deliberately written that way. This means that the author has to understand a wide range of use cases, and have the funding or flexibility to be allowed to do it that way. The other is that re-usable components have to be attractive. Once the component has been written others must be attracted to it. For this, Schmidt uses the concept of a “re-use magnet” and he thought that the open source development process is a good and effective way of creating re-use magnets.

There’s a whole can of worms that opens up when software engineering is discussed, but that is not why I’m writing this. What reminded me of all of this is the recent release of filepath. This is a package that makes file handling that bit easier by hiding the details of the Python os.path module. In that way alone it is a potential re-use magnet. Why I like the fact of this release, however, is that filepath used to be part of Twisted. Now, Twisted is a good and useful web framework, but I don’t happen to use it, and I would not want to install it just for the sake of using filepath. After all, I already have lots of code that uses os.path, so I’m not going to go out of my way to dig a new component out of a bunch of software that I’m not going to use. The advantages are not enough. On the other hand, with filepath as a downloadable component on its own the re-use magnet is no longer shielded and I can feel the attraction. What is more, now that it is out there it can grow (or shrink) to fit the wider world (as a result of more use cases and more discussion).

So, thank you to the guys at Twisted. The software re-use paradigm lives on and Chaos and Old Night have been kept at bay in one small corner of the virtual world.

URL Dispatch in Python Frameworks

May 24th, 2010 — 9:48am

I have been using Quixote for some time now as my web framework. So far, I have had very little incentive to change to another framework. The code is lightweight and reliable, to the extent that I generally just forget about it. Recently, though, I have hit some issues. It’s all to do with URL dispatch and, specifically, dynamic URLs that contain data. I want to be able to write URLs like /MyPage/edit, where ‘MyPage’ is the name of a documentation page, for example, or /reports/2009/04 which might bring up a list of reports for April 2009. This looks nicer than /reports?year=2009&month=4, which is what I used to do, and makes it easier for users to bookmark pages.

Quixote does URL dispatch using Python objects based on a Directory class. The URL is processed left to right, and each element of the URL identifies the next Directory object. It is all quite flexible, and, of course, the structure is built in software and bears no particular relationship to folders on disc or how a project is developed. It is remarkably easy to link in work from different projects, and this is a key advantage. Normally a Directory object recognises the text in a URL by direct match with an attribute of the object, but there is a catch-all that gives the option of processing elements that it does not otherwise recognise, and I have used this to support dynamic URLs. This works well enough, but there is no way to support the generation of the URLs in the first place. When I am generating a page with links on it I want to be able to write target_url = make_url(edit_template, target_page=MyPage), or something similar, and end up with a URL that the dispatch mechanism will recognise. Obviously, I can do this by hand, but the relationship between the template and the dispatch tree exists only in my head. So I end up with buggy links, and problems if I want to make changes.
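To make the left-to-right traversal concrete, here is a toy imitation in plain Python. It does not import Quixote and does not reproduce its API: the `Directory` class below and the names `exports`, `lookup`, `traverse` and `dispatch` are stand-ins invented for this sketch.

```python
class Directory:
    """Toy dispatch node; a stand-in, not Quixote's actual class."""

    exports = ()  # URL elements matched directly against attributes

    def lookup(self, component):
        """Catch-all for elements not listed in exports."""
        return None

    def index(self):
        raise LookupError("no index page")

    def traverse(self, components):
        """Consume one URL element, then hand the rest to the next object."""
        if not components:
            return self.index()
        head, rest = components[0], components[1:]
        target = getattr(self, head) if head in self.exports else self.lookup(head)
        if target is None:
            raise LookupError(head)
        if isinstance(target, Directory):
            return target.traverse(rest)
        if rest:
            raise LookupError("/".join(rest))
        return target()


class PageDirectory(Directory):
    exports = ("edit",)

    def __init__(self, page_name):
        self.page_name = page_name

    def index(self):
        return "show " + self.page_name

    def edit(self):
        return "edit " + self.page_name


class Root(Directory):
    exports = ("about",)

    def about(self):
        return "about page"

    def lookup(self, component):
        # Dynamic element: treat any unknown name as a page name.
        return PageDirectory(component)


def dispatch(root, path):
    return root.traverse([c for c in path.split("/") if c])


print(dispatch(Root(), "/MyPage/edit"))  # edit MyPage
print(dispatch(Root(), "/about"))        # about page
```

Nothing in this tree records that an edit link has the shape /{page}/edit. A link written by hand, such as "/" + page + "/edit", works only for as long as it happens to agree with the classes above.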

All this prompted me to look into URL dispatch mechanisms, to see what I think I need, and to find out if there is anything out there that already does the job. So, in no particular order, this is my shopping list:

  • URL can contain data and process-related fields
    A URL identifies both an object (the data or subject matter) and the action to perform (display, edit). In model/view/controller terms, the URL provides all of the information to identify data and identify the required code components.

  • Flexibility of URL design
    There should not be any inherent restriction on the order of items in the URL, or on how code- or data-related items might be placed in the pattern.

  • Ability to distinguish similar URL schemes
    The URLs

    • /ham/{some date}/spam
    • /ham/{some name}/spam
    • /ham/{some year}/{some month}/spam

    are all similar but may be significantly different in processing. I tried some thought experiments with Quixote Directory interpretation of these forms and came to the conclusion that it might be possible to handle things like this, but there are probably easier ways.

  • Linking to code points does not impose restrictions on code structure
    I want to build a tool that is flexible and does not enforce any particular approach. One of the claims for Quixote is that the developer simply uses their own knowledge of Python. In this vein, I don’t want to force code to be stored in a ‘controller’ directory, or to insist that a particular model-view-controller structure is used.

    Exactly how code points are identified is almost a subject in itself. For now, I just want to know that the URL dispatch process is not going to be a restriction.

  • Configuration of applications from different projects
    This is probably another view on the previous requirement. Obviously, all applications that are going to be used in whatever environment I end up with will have to be aware of some features of the environment (such as the URL generator, for example), but I have applications that have been developed over time, and I would like the ability to stitch in new applications in the future, so a reasonable configuration process is needed; one that does not require a whole bunch of rewriting to get things working.

  • Partial path handling
    By this I mean the ability to identify an action based on the first few fields of a URL and then pass the remaining fields to another dispatcher. Actually, this is effectively what the Quixote Directory object does, working one field at a time. I guess I’m looking for something less fine-grained.

  • Fields can be referenced by name
    This is for convenience and for reducing the possibility of error. It is likely to be easy for a dispatcher to report fields as a list. It is much more useful for me if I can use names.

  • URL generation
    And now, the main reason for looking at this in the first place: generation of a URL, given a template, or a reference to a template. The implied requirements here are:

    • the template should be directly related to the dispatch process without need for thought or invention on the part of the developer.
    • the names used for the fields to be substituted should be the same as the names used to extract the data when interpreting the URL.
Most of that is fairly obvious, and there are people out there who picked up on this years ago. The next part of this saga is to review what is out there and pick an approach.
