Web Site Security With repoze.who and repoze.what

This article first appeared in the May 2009 issue of Python Magazine and has been slightly updated. Its contents apply only to repoze.who 1.0 and repoze.what 1.0, not to repoze.who 2 or repoze.what 1.1, which are under development as of this writing.

Have you ever created a Web application? If so, it’s very likely that you have at one time or another faced “the security problem”: whether to create and maintain a homegrown security sub-system, or to learn to use framework-specific security mechanisms (which may not be as flexible as you wish).

Securing Web applications shouldn’t be a problem. This article explores a highly extensible alternative which you can learn once and use in arbitrary applications, regardless of the Web framework used (if any!).

“WSGI from Start to Finish” at EuroPython 2010

If you’re a Web Application Developer using Python, you may be very interested in the tutorial I am presenting at EuroPython 2010: “WSGI from Start to Finish: How to use the power of WSGI to solve problems your framework cannot solve”.

Your favorite Web framework is not able to meet all your needs, all the time; some problems cannot even be solved at the framework level. In such situations, the Python Web Server Gateway Interface may save you a lot of time and trouble, giving you the opportunity to implement an elegant solution or integrate existing framework-independent third party solutions.

And chances are, a better WSGI-based alternative exists for something your framework is apparently good at. WSGI is a very powerful technology, and the kind of things you can do with it may surprise you.

It doesn’t matter if you know little about WSGI or nothing at all, because when I say “from start to finish” I really mean it. In this half-day tutorial, I’ll try to cover both simple and complex real-world situations solved with WSGI. The tutorial is relevant for Django/Pylons/TurboGears/etc users, and for those who don’t use a Web framework at all!
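As a rough illustration of the framework-independent approach described above, here is a minimal sketch of a WSGI application wrapped in a piece of middleware. The names `hello_app`, `PoweredByMiddleware` and `call_app` are made up for this example and are not taken from the tutorial; only the WSGI calling convention itself and the standard library's `wsgiref` helper are real.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application that returns a plain-text greeting."""
    body = b"Hello, WSGI!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


class PoweredByMiddleware:
    """Hypothetical middleware: adds one header to every response.

    It wraps any WSGI application, whatever framework (if any) built it.
    """

    def __init__(self, app, value="WSGI"):
        self.app = app
        self.value = value

    def __call__(self, environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            # Append our header, then delegate to the server's callable.
            headers = list(headers) + [("X-Powered-By", self.value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, custom_start_response)


def call_app(app, path="/"):
    """Call a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a valid minimal environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # legacy write() callable, unused here

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


wrapped = PoweredByMiddleware(hello_app)
status, headers, body = call_app(wrapped)
print(status, headers["X-Powered-By"], body)
```

Because the middleware only relies on the WSGI interface, the same wrapper could in principle sit in front of a Django, Pylons or TurboGears application without changes.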

Getting back on track

Yes, I’m alive.

Since the second half of last summer I’ve been inactive in the Free Software arena. There have been no commits and no emails from me in the last few months, which may suggest that the projects are dead. So I wanted to write to let you know that I have no plans to stop maintaining any of my projects. I will start to catch up with all the things I’ve missed in the projects I normally contribute to and the projects I develop alone.

The reason why you’ve heard nothing from me is that I left Spain to move to Oxford, in order to work at the cool company behind 2degreesnetwork.com. The move was the most time-consuming and stressful thing I’d ever done, but after one month working here, I’m happy to say that it was worth it. The atmosphere is just what I thought Web 2.0 companies were like, and I am surrounded by nice and talented people. I couldn’t be happier.

Well, back to the projects, I had to wait a lot to get access to the Internet at home, but I got it a couple of weeks ago and have been catching up (slowly) with the pending stuff. I still have a huge stack of unanswered emails, for example.

For the last couple of weeks I have been working full time on repoze.what 1.1 and repoze.what-django. I hope to finish the documentation and get the first alpha releases out very soon; the code itself is pretty much ready and, as usual, fully tested. I didn’t have plans to do a repoze.what 1.1 release anytime soon, but while developing repoze.what-django I found myself implementing something which would be useful outside Django (i.e., ACLs) and thus I decided to move it to repoze.what.

After that, I want to improve the auth documentation in TurboGears 2. repoze.what-pylons is the crucial part of the repoze.what integration in TG2 and it’s fully documented, but duplicating part of those docs won’t do any harm and adding some tips and tricks would be nice. I started doing that some months ago but never committed it; I have to finish it this time.

Then I’d like to make repoze.what-pylons take advantage of the new features in repoze.what 1.1, like repoze.what-django already does.

That’s it for the foreseeable future. Next year I really want to get serious with Booleano and PyACL.

Koren’s SVD++ Python Implementation

I recently had to implement a recommender system for the Netflix Prize. Out of the best-known models, I chose Yehuda Koren’s SVD++ model as published in the paper entitled “Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model” (the version that doesn’t take into account temporal effects; I’d have implemented the complete model, but couldn’t due to time constraints).

I named this Python-based project “wooflix” and you can download it from code.gustavonarea.net. It ships with a command-line interface and basic documentation, including the design document.

It’s the first project, as far as I know, that uses Booleano. With it, you can get random movie recommendations and filter them, like this:

# Get at most 5 movie recommendations for user #7, limited to those published after 2001
wooflix recommendations 7 --max="5" --filter="movie:year > 2001"

Keep in mind that I won’t offer support for it; I’m publishing it because I thought it might be useful for some people, but I have no intention of working on it in the future.

Announcing Booleano

I am proud to announce the first alpha release of Booleano, a Python-based interpreter of boolean expressions:

Booleano is an interpreter of boolean expressions, a library to define and run filters available as text (e.g., in a natural language) or in Python code.

In order to handle text-based filters, Booleano ships with a fully-featured parser whose grammar is adaptive: Its properties can be overridden using simple configuration directives.

On the other hand, the library exposes a pythonic API for filters written in pure Python. These filters are particularly useful to build reusable conditions from objects provided by a third party library.

It’s been designed to address the following use cases:

  1. Convert text-based conditions: When you need to turn a condition available as plain text into something else (i.e., another filter).
  2. Evaluate text-based conditions: When you have a condition available as plain text and need to iterate over items in order to filter out those for which the evaluation of the condition is not successful.
  3. Evaluate Python-based conditions: When you have a condition represented by a Python object (nothing to be parsed) and need to iterate over items in order to filter out those for which the evaluation of the condition is not successful.

It is a project I found necessary while working on repoze.what 2, which I’ve been developing for the last few months in my spare time. This release is absolutely usable, but lacks documentation because I needed it out for a (small) project that I have to work on ASAP and that will depend on Booleano. The next release will ship with nice documentation, I promise.
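To make the third use case (evaluating Python-based conditions) more concrete, here is a small sketch in plain Python. It deliberately does not use Booleano's API, which is not documented in this post: the `Condition` class, the `keep_matching` helper and the sample movies are all hypothetical and only illustrate the idea of building reusable conditions and filtering items with them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movie:
    """Hypothetical item type used only for this illustration."""
    title: str
    year: int


class Condition:
    """Hypothetical reusable condition (NOT Booleano's actual API).

    It wraps a predicate and can be combined with &, | and ~.
    """

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, item):
        return bool(self.predicate(item))

    def __and__(self, other):
        return Condition(lambda item: self(item) and other(item))

    def __or__(self, other):
        return Condition(lambda item: self(item) or other(item))

    def __invert__(self):
        return Condition(lambda item: not self(item))


def keep_matching(items, condition):
    """Filter out the items for which the condition is not met."""
    return [item for item in items if condition(item)]


# Made-up sample data.
movies = [
    Movie("Movie A", 1999),
    Movie("Movie B", 2003),
    Movie("Movie C", 2008),
]

published_after_2001 = Condition(lambda movie: movie.year > 2001)
published_before_2005 = Condition(lambda movie: movie.year < 2005)

recent = keep_matching(movies, published_after_2001)
early_2000s = keep_matching(movies, published_after_2001 & published_before_2005)
old = keep_matching(movies, ~published_after_2001)

print([movie.title for movie in recent])
```

Keeping each condition as an object means it can be defined once and reused or combined wherever items need to be filtered.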

Dell is ashamed of its Ubuntu-powered laptops

My laptop was slow while running my ball and chain, KDE 4, and some things on it broke recently (e.g., battery, screen hinges), so last week I decided to buy a new one before the old one left me stranded. And soon enough I realized that I had two options:

  • Buy it in a place where every single computer ships with Windows, so that I could claim a refund. I didn’t care about the money: I just wanted to mess with that kind of vendor and file a lawsuit if I didn’t get it on good terms, to encourage people to do the same thing and thus help do away with the Windows Tax.
  • Purchase it from a vendor that pre-installs Linux, to support them. Even if they also pre-install a freedom-trampling system like Windows, it’d be good to show them that Freedomware is worth it.

I liked both options alike, so I based my decision on the computer specs and costs, not on the vendor/manufacturer.

I decided to get a Dell XPS M1330, one of the two Ubuntu-powered computers that I remembered Dell sells in Spain. So I visited dell.es/ubuntu and was surprised to find just a couple of netbooks! Change of plans; now I’ll have to get it with Windows and claim a refund, I told myself.

So the first step was to get proof that the operating system was imposed on me when I bought the laptop. Sales representatives were available for a chat, so I asked them how I could get a Dell XPS M1330 without Windows. The surprising answer was that it was available with Ubuntu, and they pointed me to configure2.euro.dell.com/dellstore/! Plans changed one more time; back to the original plan, get it with Linux.

I obviously asked why it wasn’t listed on dell.es/ubuntu. The sales rep said that s/he didn’t know why and that s/he would forward my query to the relevant department. I bought the laptop with Ubuntu that day and that was it.

Today, out of curiosity, I went to dell.es/ubuntu and found that it hasn’t changed! The link the sales rep gave me the other day still works, but the laptop is not listed. And the same happens on dell.fr/ubuntu, dell.co.uk/ubuntu and dell.de/ubuntu, for example.

This can hardly be a mistake. Why the heck does Dell hide some of the few Linux-powered computers they sell now? Maybe due to threats from Microsoft? After all, it’s well known for its monopolistic practices.

PS (April 18th @ 14:00 UTC): The link above to configure2.euro.dell.com/dellstore/ doesn’t work at times today, so here’s a screenshot if it doesn’t work for you:

PS (April 19th @ 18:30 UTC): This is a screenshot of the random error I warned about yesterday (which I took just in case), before reaching Digg.com’s front page:

Now, almost 20 hours after reaching Digg’s front page, the link no longer works (not even intermittently, as it did yesterday) and a better-formatted page is displayed instead:

I don’t know if the different error pages actually mean something, but my point is that the link is now dead.

Are you a Software Developer or a Software Engineer?

(Practice has changed my view on the subject since I wrote this ranty and simplistic post. As of 2014, I still agree with the essence of this article, but I’m planning on creating a follow-up that better addresses the complexity of this subject.)

Tired of the careless use of the term “software engineering”, where “software developer” and “software engineer” seem to be interchangeable, I’m writing this article to explain what I think Software Engineering really is.

But first, let’s remember some basic terminology:

Programmer
Anyone who can create a program in at least one programming language, regardless of the use of a systematic approach (if any).
Software developer
A software developer is a programmer who doesn’t only care about writing code, but also cares about (although may not be directly involved in) the requirement analysis, the functional specification, the design, the testing, the deployment and the maintenance of the software product they work on. Disciplined software developers usually follow a software development methodology, like XP.
Engineering
According to Wikipedia (emphasis mine): “Engineering is the discipline and profession of applying technical and scientific knowledge and utilizing natural laws and physical resources in order to design and implement materials, structures, machines, devices, systems, and processes that safely realize a desired objective and meet specified criteria.”

Both programmers and software developers qualify the software progress. They often can’t meet deadlines or track progress because they don’t know for sure where they are or where they should be. Qualification is subjective and absolutely imprecise, so you can only have subjective and imprecise answers to precise questions like “when is it going to be ready?” (to which the most common answers are “soon” or “when it’s ready” in the freedomware world).

When you travel by car, what can you do to find out how far you are from the destination and how much time is left? You have to measure. If you find a sign stating that you’re 20 km away from your destination and you measure the car’s current speed (well, your car does so for you) and it turns out to be 60 km/h, then you’ll realize that if you keep that speed you’ll arrive in 20 minutes. If you don’t measure, you can’t tell if you’re on time and you can’t even avoid being late next time (to improve, you need to know the previous measurements!).
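The arithmetic in the car example can be written down as a tiny sketch. The function names and the 15-minute deadline are made up for illustration; the 20 km and 60 km/h figures are the ones from the example above.

```python
def minutes_remaining(distance_km, speed_kmh):
    """Estimate the remaining travel time in minutes at a constant speed."""
    if speed_kmh <= 0:
        raise ValueError("speed must be positive to estimate an arrival time")
    return distance_km * 60 / speed_kmh


def is_on_time(distance_km, speed_kmh, minutes_available):
    """Tell whether the destination can be reached within the time left."""
    return minutes_remaining(distance_km, speed_kmh) <= minutes_available


# 20 km to go at 60 km/h: 20 * 60 / 60 = 20 minutes.
eta = minutes_remaining(20, 60)
print(f"{eta:.0f} minutes to go")

# Hypothetical deadline: with only 15 minutes left, we would be late.
print(is_on_time(20, 60, minutes_available=15))
```

Without the two measurements (distance and speed), neither function could be evaluated, which is the point of the analogy.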

If you quantify, you will find the real status of a given process and whether you’ll reach your goals within the desired parameters (time, money, etc.). If you quantify and analyze such measurements, you will be able to make the right corrections in order to improve the process and thus reach the goals within the desired parameters, or at least reduce the difference between the desired parameters and the final results (that is, reduce risk). And that’s not specific to software.

So, the difference between a disciplined software developer and a software engineer is that the former qualifies and the latter quantifies. In a software engineering project, when a process is going wrong, it’s found (sooner or later) thanks to software metrics (or “software measurements”) and the appropriate steps are taken to reduce risk. In a software development project, the process is not measured and the software product is delivered outside at least one parameter (over budget, with fewer features, after the deadline, etc.).

I don’t think you need a diploma that says you’re a software engineer (or hold a position ending in “Software Engineer” in an organization) to call yourself a “software engineer”, unless required by local law. But you need to be a disciplined software developer who measures the software process and makes decisions based on an objective analysis of the relevant measures.

Learn more

There are good resources out there to learn more about software measurement. The one I strongly recommend is “Software Measurement” by Christof Ebert and Reiner Dumke (ISBN: 978-3-540-71648-8). This book is a great introduction to software measurement and covers the four kinds of software metrics (project, process, product and people metrics). I think it’s a must-read for anyone who is involved in software processes and wants to improve continuously (which can only be achieved by measuring!).

But there are also good resources on the Web, like the ones listed below. Unfortunately, I couldn’t find an online equivalent of the book above.

Auth: What you may expect from TurboGears 2

Those still using TurboGears 1 will find a big improvement in the authentication and authorization area when they upgrade to version 2: TurboGears 2 ships with an easy-to-use, pluggable, extendable and well-documented authentication and authorization system, powered by repoze.who and tgext.authorization (whose documentation will be available along with TurboGears’ very soon).

Some of the features include:

  1. You may store your users’ credentials where you want – in a database, an LDAP server, an .htaccess file, etc.
  2. You’ll be able to store your groups and permissions where you like too, and also use as many group and permission sources as you need. What if your application’s main database already stores your groups and permissions data, but the company’s IT department needs to reuse their Htgroups file in the application? That would be a piece of cake.
  3. You’ll be able to manage your authorization settings with an API independent of the source(s) used (databases, Ini files, etc). Yes, add/edit/delete groups and/or permissions.
  4. You’ll be able to grant permissions to anonymous users (hopefully available this week).
  5. Do the above and more without writing too much code.

Right now there’s only the SQL plugin, so in the meantime you may only store your groups and permissions in a SQLAlchemy or Elixir managed database, but very soon we’ll have the Ini plugin (to store groups and permissions in *.ini files) and even more.

In the future you’ll also be able to get OpenID authentication with a couple of lines of code (it’s a work in progress) and possibly OAuth authorization too.

And you may give it a try now! You can either try the latest code from the trunk or wait for the first TG2 beta which will hopefully be released in a couple of days.

The repoze.who LDAP plugin will be an official plugin

Some weeks ago I was invited to make repoze.who.plugins.ldap an official repoze.who plugin, which means that:

  • The license will change. It will use Repoze’s.
  • The development tools (bug tracker, repository, etc.) will be migrated away from Launchpad.
  • The LDAP plugin’s documentation will be included in repoze.who’s.
  • It will be maintained by Repoze committers, and I’m one of them.

I’ve not started the migration, but I hope to start in a few days.

Server load: DreamHost vs WebFaction

I got tired of the slowness of DreamHost servers and its consequences, such as sites being down or extremely slow from time to time. So I decided to migrate progressively over the next few weeks to WebFaction because:

  1. The costs are the same.
  2. They don’t overload their servers.
  3. They have an excellent reputation in the TurboGears community.

So here I offer a comparison of the server load on my DreamHost shared host vs my WebFaction shared host.