<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.0.4" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>lirico</title>
	<link>http://www.lirico.co.uk/wp</link>
	<description>Stephen Pascoe's weblog</description>
	<pubDate>Sun, 25 Feb 2007 02:00:58 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.4</generator>
	<language>en</language>
			<item>
		<title>Tales from Dallas II</title>
		<link>http://www.lirico.co.uk/wp/?p=22</link>
		<comments>http://www.lirico.co.uk/wp/?p=22#comments</comments>
		<pubDate>Sun, 25 Feb 2007 01:53:56 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>Python</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=22</guid>
		<description><![CDATA[There&#8217;s a lot relevant to scientific computing.  For instance a very
impressive presentation by Travis Oliphant on the state of numpy and
why everyone should abandon Numeric except for legacy code.  I hope the
audio will be available on the website soon but in the meantime the
slides are here
I am considering trying to get cdat_lite [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s a lot relevant to scientific computing.  For instance a very<br />
impressive presentation by Travis Oliphant on the state of numpy and<br />
why everyone should abandon Numeric except for legacy code.  I hope the<br />
audio will be available on the website soon but in the meantime the<br />
slides are <a href="http://us.pycon.org/common/talkdata/PyCon2007/045/PythonTalk.pdf">here</a></p>
<p>I am considering trying to get cdat_lite working with numpy as people<br />
here assure me the transition shouldn&#8217;t be too hard.</p>
<p>People from the scipy vendor Enthought are here in force.  They have a<br />
framework for interactive visualisation called <a href="http://code.enthought.com/chaco/">Chaco</a> which looks<br />
great, although not really relevant for web-based development.  They<br />
haven&#8217;t tried it with basemap yet.  It was also reassuring to see<br />
everyone agrees matplotlib installation is a problem.</p>
<p>I think I&#8217;m going to start using IPython again after I saw a demo.<br />
This is an interactive python shell with enhanced introspection,<br />
system access, debugging etc.  It works particularly well with<br />
matplotlib as a visualisation environment.</p>
<p>The lightning talks this evening were both entertaining and<br />
informative.  What about ZjengoGears &#8212; the 7th generation<br />
web-framework that knows how you want to design your website?  More<br />
seriously, <a href="http://tgwebservices.python-hosting.com/">tgwebservices</a> looked like a pain-free way of supporting SOAP<br />
+ WSGI (can it really be that simple?) and there was an impassioned<br />
plea to add numpy to the standard library.</p>
<p>Oh and Eggs seems to be one of the buzzwords of the conference &#8212; lampooned<br />
as such at times.  I felt I might be preaching to the converted but the content went well.  There were some embarrassing technical difficulties brought on by my S5 presentation system.  I think I&#8217;ll go with the safety of ooimpress next time.  The material is available <a href="http://us.pycon.org/common/talkdata/PyCon2007/049/using_python_eggs.zip">here</a> and <a href="http://us.pycon.org/common/talkdata/PyCon2007/052/developing_with_eggs.zip">here</a>, although I hope to fix a few typos and get out better versions soon.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=22</wfw:commentRSS>
		</item>
		<item>
		<title>Tales from Dallas</title>
		<link>http://www.lirico.co.uk/wp/?p=21</link>
		<comments>http://www.lirico.co.uk/wp/?p=21#comments</comments>
		<pubDate>Sat, 24 Feb 2007 01:08:34 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>Python</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=21</guid>
		<description><![CDATA[It&#8217;s time to resurrect this blog from the ashes.
Here I am at PyCon2007 in Dallas, seeing face to face many of the
people whose names line up in my email browser each morning.  I&#8217;ve
just come out of a panel on Python web frameworks and now feel slightly
better qualified to evaluate the superfluity of options.
In the lineup [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s time to resurrect this blog from the ashes.</p>
<p>Here I am at PyCon2007 in Dallas, seeing face to face many of the<br />
people whose names line up in my email browser each morning.  I&#8217;ve<br />
just come out of a panel on Python web frameworks and now feel slightly<br />
better qualified to evaluate the superfluity of options.</p>
<p>In the lineup were Django, Pyjamas, Zope, Spyce, Turbogears, Pylons,<br />
CherryPy, Twisted.  Quite a diverse selection.</p>
<p>There are some good notes on the session at<br />
<a href="http://www.b-list.org/weblog/2007/02/23/pycon-2007-web-frameworks-panel">http://www.b-list.org/weblog/2007/02/23/pycon-2007-web-frameworks-panel<br />
</a><br />
One juicy comment was &#8220;the plethora of web frameworks seems Perlish and<br />
not Pythonic&#8221;.  This was very well handled by pointing out web<br />
frameworks are a lot less mature than Python and there is still a lot<br />
of experimenting going on.  That&#8217;s a good answer for framework<br />
developers but doesn&#8217;t make choosing between the options any easier.<br />
Still, I think I should give django a try sometime.</p>
<p>Other interesting talks: a mind-blowingly technical talk on iterators &#8212; still exciting though,<br />
a heartening talk about teaching Python to school children and a<br />
rallying call to improve Python advocacy.</p>
<p>The good news is that every single sponsor in the lightning talks was<br />
recruiting.  The bad news is that Python development has a supply-side<br />
problem unless it gets the word out about its power.</p>
<p>I hope more is to come but I have a Python for Science BoF to go to.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=21</wfw:commentRSS>
		</item>
		<item>
		<title>cdat_lite is born</title>
		<link>http://www.lirico.co.uk/wp/?p=20</link>
		<comments>http://www.lirico.co.uk/wp/?p=20#comments</comments>
		<pubDate>Thu, 28 Dec 2006 21:34:46 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>NDG</category>
	<category>Python</category>
	<category>Eggs</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=20</guid>
		<description><![CDATA[After much tinkering cdat_lite is ready for a wider audience.  Primarily developed for use as a component of the NERC Data Grid, cdat_lite is a simple repackaging of the I/O layer of the Climate Data Analysis Tools (CDAT) as a Python Egg.
The BADC has found CDAT&#8217;s data management layer (CDMS) invaluable in developing server-side [...]]]></description>
			<content:encoded><![CDATA[<p>After much tinkering <a href="http://proj.badc.rl.ac.uk/ndg/wiki/CdatLite">cdat_lite</a> is ready for a wider audience.  Primarily developed for use as a component of the <a href="http://ndg.nerc.ac.uk">NERC Data Grid</a>, cdat_lite is a simple repackaging of the I/O layer of the <a href="http://www-pcmdi.llnl.gov/software-portal/cdat">Climate Data Analysis Tools (CDAT)</a> as a <a href="http://peak.telecommunity.com/DevCenter/PythonEggs">Python Egg</a>.</p>
<p>The <a href="http://www.badc.rl.ac.uk">BADC</a> has found CDAT&#8217;s data management layer (CDMS) invaluable in developing server-side analysis tools.  It handles the sorts of calendars only found in numerical modelling, abstracts away NetCDF coordinate variables on a variety of grids and allows aggregation of huge multi-file datasets into a logical dataset.  We like it so much that we developed an input layer for the UK Met. Office PP format.</p>
<p>However CDAT aspires to be much more than this.  It is a comprehensive data analysis environment with a GUI and visualisation components and as such has grown into rather a hefty package.  It&#8217;s a 160MB download from sourceforge including, among other things, its own Python tarball.  This can be rather inconvenient if you have your own personalised Python environment.  Non-default installation can be tricky and time-consuming &#8212; not too much of a problem when setting up a single workstation but more arduous at a place such as the BADC where we have quite a heterogeneous network of Linux systems.  Add to this an evolving codebase, as the cdunifpp component has matured, and you have a recipe for multiple installations, most of which are out of date.</p>
<p>cdat_lite tries to fix this by taking out the bits of CDAT we find most useful &#8212; the <tt>libcdms</tt> I/O layer and the core python packages <tt>cdms, cdutil, cdtime, unidata, genutil, regrid and xmgrace</tt>.</p>
<p>The first cut of cdat_lite was relatively easy to create.  To eggify the CDAT packages only required a new <tt>setup.py</tt> script and a single patch to <tt>unidata</tt> to load a datafile with the <tt>pkg_resources</tt> API.  Similarly <tt>libcdms</tt> had its own <tt>configure</tt> script and <tt>Makefile</tt> that could be called from <tt>setup.py</tt>.  I could build binary eggs for x86_64 and i686 from within my sandbox that seemed to work once deployed. The fun came when I became fussier about how easy it should be to install.  I wanted the tarball to work with <tt>easy_install</tt> as well as the pre-built eggs.  I wanted this to work on as many machines at the BADC as possible.  In particular RedHat Enterprise, SUSE 10 i686 and x86_64 and kubuntu i686.  This opened up several windy roads.</p>
<h3>Dependencies 1: Numeric</h3>
<p>CDAT needs Numeric.  It needs the Numeric header files to compile libcdms.  Numeric is quite often present on the target python installation but not always.  When it is installed it may or may not include the Numeric headers and probably doesn&#8217;t have an <tt>EGG-INFO</tt> directory (therefore it isn&#8217;t detected by setuptools).  Numeric isn&#8217;t on the cheeseshop and isn&#8217;t easily downloadable from sourceforge because it is a legacy package.</p>
<p>At first I just said you needed Numeric before you begin but this didn&#8217;t seem a very good advert for the ease of using eggs.  If Joe user had a Numeric installation without the header files he&#8217;d be stuck.  In the end I decided to mirror the Numeric tarball on the cdat_lite site and include it as a dependency.  This means Numeric will be downloaded and built automatically by <tt>easy_install</tt>. You might end up installing Numeric twice (one egg, one not) but you&#8217;ll always have the headers.</p>
<p>Sorting this out was a big learning experience for me in how to use setuptools or, more precisely, how to extend distutils in a setuptools compatible way.  The location of the Numeric headers can be determined by:</p>
<blockquote>
<pre>>>> from Numeric_headers import get_numeric_include
>>> get_numeric_include()
'/usr/lib/python2.4/site-packages/Numeric-24.2-py2.4-linux-i686.egg/Numeric_headers'</pre>
</blockquote>
<p>If Numeric isn&#8217;t installed you can tell setuptools to install it with the <tt>setup_requires</tt> keyword.  The problem is that in this case Numeric will be installed from within <tt>setup</tt>. Therefore, you can&#8217;t import Numeric_headers in <tt>setup.py</tt> because it might not be installed yet. The solution is to subclass <tt>setuptools.command.build_ext</tt>, adding <tt>get_numeric_include()</tt> to <tt>self.include_dirs</tt>. This way you can still add your own <tt>include_dirs</tt> on the command line or in a configuration file.  It&#8217;s a pity it&#8217;s so difficult to work out how to do this without reading the distutils source.</p>
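<p>As a rough sketch of that subclassing trick (not the actual cdat_lite code): the helper name <tt>with_numeric_headers</tt> and its <tt>find_include_dir</tt> parameter are made up for illustration, and hooking into <tt>finalize_options</tt> is an assumption about where the include directory gets added.  The point is that the <tt>Numeric_headers</tt> import is deferred until the command runs, by which time <tt>setup_requires</tt> should have installed Numeric.</p>

```python
def with_numeric_headers(base_build_ext, find_include_dir=None):
    """Return a build_ext subclass that adds the Numeric header dir late.

    base_build_ext is the command class to extend (normally
    setuptools.command.build_ext.build_ext).  find_include_dir is a
    hypothetical hook, mainly useful for trying this out without Numeric.
    """
    def default_find():
        # Deferred import: Numeric may not exist when setup.py is first read.
        from Numeric_headers import get_numeric_include
        return get_numeric_include()

    finder = find_include_dir or default_find

    class build_ext(base_build_ext):
        def finalize_options(self):
            # Let the base class merge command-line and config-file
            # include_dirs first, then append the Numeric headers.
            base_build_ext.finalize_options(self)
            self.include_dirs.append(finder())

    return build_ext

# Possible wiring in setup.py (a sketch, other arguments omitted):
#   from setuptools import setup
#   from setuptools.command.build_ext import build_ext as _build_ext
#   setup(..., setup_requires=['Numeric'],
#         cmdclass={'build_ext': with_numeric_headers(_build_ext)})
```

<p>Because the base class runs first, any <tt>include_dirs</tt> given on the command line or in a configuration file are kept and the Numeric directory is simply appended to them.</p>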
<h3>Dependencies 2: NetCDF</h3>
<p>Again netcdf is usually installed but not always.  An added complication is that on x86_64 the library must be compiled with the <tt>-fPIC</tt> option for it to be usable within a DSO.  I discovered that on some of our x86_64 machines we had a netcdf installation that seemed fine (libnetcdf.a, ncdump and ncgen present) but it was useless for building python extension modules.</p>
<p>Clearly there was a case for cdat_lite building its own libnetcdf.a. The current version includes the netcdf tarball although it could download it from unidata in the future.  When compiled, the egg includes its own copy of the netcdf libraries and headers.  This will allow future eggified CDAT components (vcs_lite is in the works) to compile and link against a consistent netcdf installation.</p>
<h3>x86_64 compatibility</h3>
<p>There is a cautionary tale about the virtue of unit tests here.  I was quite pleased with how easy it was to compile an x86_64 egg.  The modules imported fine and I tested reading a PP file.  I publicised cdat_lite at the BADC and moved on.  When I returned to polish the code I built a handful of trivial unit tests, including reading a NetCDF file.  I then discovered that it couldn&#8217;t read NetCDF on x86_64!</p>
<p>It turns out that CDAT 4.1 isn&#8217;t compatible with x86_64. There is an important patch in the CDAT SVN which was obviously going to fix this so I merged in the SVN trunk.  cdat_lite is now based on a particular revision of the CDAT SVN.  This isn&#8217;t ideal but is better than not having NetCDF.</p>
<h3>Future</h3>
<p>I hope <tt>vcs_lite</tt> will be ready for registering at the cheeseshop soon.  It will provide the vcs canvas and hardcopy output without any Tk widgets or VCDAT.  This is all we need to build web applications based on CDAT.</p>
<p>It would be fun to try and write a <a href="http://www.pydap.org">pydap</a> plugin on top of cdat_lite.  It looks easy enough (famous last words &#8230;)
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=20</wfw:commentRSS>
		</item>
		<item>
		<title>Off to Dallas</title>
		<link>http://www.lirico.co.uk/wp/?p=19</link>
		<comments>http://www.lirico.co.uk/wp/?p=19#comments</comments>
		<pubDate>Fri, 01 Dec 2006 14:46:59 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>Python</category>
	<category>Eggs</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=19</guid>
		<description><![CDATA[It looks like I'll be taking a trip across the pond to go to PyCon TX 2007.  I'll be giving two talks on Python Eggs.]]></description>
			<content:encoded><![CDATA[<p>It was a great piece of chance that led me to submit a couple of talk proposals to <a href="http://us.pycon.org">PyCon TX 2007</a>.  I&#8217;ve been working on using Python Eggs for <a href="http://ndg.nerc.ac.uk/">NDG</a> and one of my colleagues noticed a request on the python mailing list for someone to give an Eggs talk.  What better way to consolidate my knowledge of the technology than to present on it?</p>
<p>Those visitors from the PyCon website might like to pick up a recent presentation on Eggs I gave to my department at the Rutherford Appleton Laboratory: see <a href="http://proj.badc.rl.ac.uk/ndg/wiki/PythonEggs">this wiki page</a>.  My audience was a broad mixture of scientific software developers (some in Python, some not) and members of the BADC who need to know what direction NDG is going.  For PyCon I&#8217;ll have to refactor this considerably.  I had originally designed the two talks to be part of a series of 3 but it now looks like there isn&#8217;t an egg-specific advanced talk (although #46 seems to include some eggs material).</p>
<p>Also see the tarball of the demo I gave at the end.  With ez_setup.py and a few lines of shell script I was able to create an OPeNDAP server with WMS and KML output (mainly thanks to <a href="http://www.pydap.org">pydap</a>).  Ending a presentation with Google Earth is always a good bet.</p>
<p>Browsing through the accepted PyCon talks I can see it&#8217;s going to be well worth the trip.  For a start Roberto De Almeida will be giving a talk on pydap and Dr. Travis E Oliphant will be talking about NumPy so there&#8217;s instant overlap with geophysical applications.   I&#8217;m also looking forward to hearing about IronPython, WSGI and Zope 3.  Finally who could miss Guido&#8217;s keynote?</p>
<p>Ah well!  I guess there&#8217;s no escaping becoming a fully fledged Python geek groupie now.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=19</wfw:commentRSS>
		</item>
		<item>
		<title>CML and document vs. database</title>
		<link>http://www.lirico.co.uk/wp/?p=18</link>
		<comments>http://www.lirico.co.uk/wp/?p=18#comments</comments>
		<pubDate>Tue, 31 Oct 2006 21:36:05 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>Uncategorized</category>
	<category>Cheminformatics</category>
	<category>MCM</category>
	<category>RDF</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=18</guid>
		<description><![CDATA[In this second post in a series laying out my ideas for the MCM/IUPAC kinetic database integration I&#8217;m going to be taking a look at a suitable data format.
We are dealing with a mixture of symbolic information (chemical structure and reactions), numeric or algebraic information (rate coefficients, temperature-dependent rate expressions, branching ratios) and textual [...]]]></description>
			<content:encoded><![CDATA[<p>In this second post in a series laying out my ideas for the <a href="http://mcm.leeds.ac.uk/MCM">MCM</a>/<a href="http://www.iupac-kinetic.ch.cam.ac.uk/">IUPAC</a> kinetic database integration I&#8217;m going to be taking a look at a suitable data format.</p>
<p>We are dealing with a mixture of symbolic information (chemical structure and reactions), numeric or algebraic information (rate coefficients, temperature-dependent rate expressions, branching ratios) and textual information (references, explanation of experimental details, etc.).  For the chemistry-specific symbolic stuff there is, in my view, only one game in town: <a href="http://cml.sf.net/">CML</a>.</p>
<p>I&#8217;ve been keeping half an eye on CML for some time but I am a long way from knowing everything about it, so the following opinions come with the usual caveat of my fallibility.  CML was born well before the world and his aunt jumped on the XML bandwagon.  In keeping with the original vision of XML it is of the &#8220;XML as document&#8221; rather than &#8220;XML as data type&#8221; school, aiming to enable more semantically rich electronic publishing.  In fact CML offers much more than just a representation of species and reactions.  Through its various modular extensions STMML, CMLReact, CMLComp and CMLCM &#8212; to mention just the most mature ones &#8212; CML can capture a lot of the structure of a typical chemistry publication.</p>
<p>This is great news for us, particularly IUPAC where they currently publish data as a set of PDF files.  If desired I expect it would be relatively straightforward to translate all the IUPAC datasheets, in their entirety, into a mixture of CMLCore, CMLReact and STMML.</p>
<p>CML can achieve this by being unashamedly liberal about semantics. The schemas (<a href="http://cml.sourceforge.net/schema/cmlCore.xsd">CMLCore</a>, <a href="http://cml.sourceforge.net/schema/STMML.xsd">STMML</a>, <a href="http://cml.sourceforge.net/schema/cmlreact.xsd">CMLReact</a>) are sprinkled with elements documented as <em>&#8220;deliberately very general&#8221;</em> (<code>cml:substance</code>), <em>&#8220;content model is deliberately lax&#8221;</em> (<code>cml:identifier</code>) and <em>&#8220;no controled semantics&#8221;</em> (<code>cml:observation</code>).  This might be worrying if one had to write software to handle all possible constructions but that isn&#8217;t our requirement and there is a clear mechanism for being more restrictive with many elements having a <code>convention</code> attribute (again <em>no controlled vocabulary</em> <img src='http://www.lirico.co.uk/wp/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> ).</p>
<p>It&#8217;s worth noting here that there is a difference between the communication format and the underlying model.  The MCM and IUPAC needn&#8217;t agree on how they store and serve the information provided the format and communication medium are agreed (what software engineers would call the interface).  CML would be a good fit for communicating information from IUPAC to the MCM.  It could also be used as the underlying data model of the IUPAC database and it would have the advantage of being similar to what they have now.  However, I believe CML alone would be quite a restrictive model for the MCM, for here we have an example of where document-centric information falls down.</p>
<p>One could describe the overall structure of the MCM as a forest of trees starting from a relatively small set of root species (those species thought to be representative of primary VOC emissions), branching rapidly on each reaction but also overlapping as intermediate species are formed from different pathways.  Finally everything ends up as CO<sub>2</sub> + water.</p>
<p>There are, therefore, only two alternatives if you want to systematically divide the MCM into discrete documents (well, 3 if you include 1 document per reaction).  Either put the whole 4500 species and 12600 reactions in one document or have 4500 documents, one for each reactant.  Although the former is technically feasible, in practice people want to browse and extract subsets of the mechanism.  A relational model is well suited to this as it doesn&#8217;t force document boundaries on the data.  This is what is done at the moment with the MCM website&#8217;s MySQL backend.</p>
<p>However, at the BADC there is a consensus that dataset-specific databases are bad news for data curation.  Unlike a document or file you can&#8217;t pass a database around (without SQL compatibility headaches), they pose subtle <a href="http://home.badc.rl.ac.uk/lawrence/blog/2005/12/22/data_citation">citation challenges</a> and each instance has a different structure, requiring its own expert to maintain it.</p>
<p>Therefore, for deposition at the BADC at least, we will need a document-centric format for the entire MCM and the challenge is to try and keep all the nice features you get with a database.  Maybe this could be achieved with one big CML document and <a href="http://www.w3.org/TR/xquery/">XQuery</a>?  I know next to nothing about the technology.  I would favour taking the 4500 CML documents and annotating them with <a href="http://www.w3.org/RDF/">RDF</a>.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=18</wfw:commentRSS>
		</item>
		<item>
		<title>&#8220;Getting&#8221; Python Paste</title>
		<link>http://www.lirico.co.uk/wp/?p=16</link>
		<comments>http://www.lirico.co.uk/wp/?p=16#comments</comments>
		<pubDate>Sun, 29 Oct 2006 14:51:31 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>Uncategorized</category>
	<category>Python</category>
	<category>Eggs</category>
	<category>FastCGI</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=16</guid>
		<description><![CDATA[OK, my boss is getting excited about FastCGI and I am spending a lot of my time getting my head around Python Eggs. From what I read on the web the missing link is Paste.  So I start googling &#8230;
I have tried this before and have found the information on pythonpaste.org rather mystifying. I [...]]]></description>
			<content:encoded><![CDATA[<p>OK, my boss is <a href="http://home.badc.rl.ac.uk/lawrence/blog/2006/10/26/exploring_web_server_backends_-_installing_fastcgi_and_lighttpd">getting excited</a> about <a href="http://www.fastcgi.com/">FastCGI</a> and I am spending a lot of my time getting my head around <a href="http://peak.telecommunity.com/DevCenter/PythonEggs">Python Eggs</a>. From what I read on the web the missing link is Paste.  So I start googling &#8230;</p>
<p>I have tried this before and have found the information on <a href="http://pythonpaste.org">pythonpaste.org</a> rather mystifying. I get the idea of a wsgi toolbox but there must be more to justify the hype.  So this time I looked a little further afield to <a href="http://www.groovie.org/articles/2005/10/04/python-paste-power">this post</a> (google hit no. 3, to be precise).  This looked more interesting.  The <code>paster</code> script appeared to be similar to some of the things Eclipse can do in the Java world: create skeleton projects and deploy web applications.</p>
<p>I knew that Paste was heavily into eggs so I decided to start hacking.</p>
<blockquote><p><code> $ easy_install paste<br />
$ paster create --list-templates<br />
bash: paster: command not found </code></p></blockquote>
<p>Not a good start, and quite the reverse of what I would expect from an egg-enabled package.  Interrogating the Paste egg reveals no sign of <code>paster</code>.  I must have the wrong package.  I dig deeper into pythonpaste.org; there are components called <code>python.deploy</code> and <code>python.script</code>.  It could be either of these since <code>paster</code> is a script that deploys stuff.  I install both and we are back on track. Now I&#8217;ve been forced to look closer at pythonpaste.org I begin to realise the front page is a red herring.  The really exciting stuff goes on in Paste Deploy/Script and these are the packages that people usually mean when they evangelise about Paste.</p>
<p><a href="http://pythonpaste.org/deploy/">Paste Deploy</a> is a tool for configuring stacks of WSGI components.  With it you can plug WSGI applications, middleware and servers together using configuration files and Egg entry points.  The Paste Deploy docs have an <a href="http://pythonpaste.org/deploy/#id6">excellent example</a> of how this can work to serve multiple apps from the same server and include HTTP authentication into the mix.  This is easily the most impressive use of entry points I have seen.</p>
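<p>To give a feel for what an egg has to provide, here is a minimal sketch of an application factory in the shape Paste Deploy looks for under the <code>paste.app_factory</code> entry point group.  The module name <code>myapp</code>, the egg name <code>MyApp</code> and the <code>greeting</code> option are all made up for illustration; this is not taken from the Paste Deploy docs.</p>

```python
def app_factory(global_config, **local_conf):
    """Build a WSGI app from config, as a paste.app_factory would.

    global_config holds the [DEFAULT] settings of the config file and
    local_conf holds the options from the application's own section.
    """
    greeting = local_conf.get('greeting', 'Hello')

    def application(environ, start_response):
        # A plain WSGI callable; nothing here depends on Paste itself.
        path = environ.get('PATH_INFO', '/')
        body = ('%s from %s\n' % (greeting, path)).encode('utf-8')
        start_response('200 OK', [('Content-Type', 'text/plain'),
                                  ('Content-Length', str(len(body)))])
        return [body]

    return application

# Hypothetical wiring (all names are illustrative):
#   setup.py:    entry_points={'paste.app_factory':
#                              ['main = myapp:app_factory']}
#   deploy.ini:  [app:main]
#                use = egg:MyApp
#                greeting = Hi
```

<p>The factory itself never imports Paste, which is rather the point: the configuration file and the entry point do the gluing, and the same callable could be handed to any other WSGI server.</p>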
<p><a href="http://pythonpaste.org/script/">Paste Script</a> (a.k.a. <code>paster</code>) works closely with Deploy.  It allows you to start up WSGI servers on the command line, including Deploy configurations.  Another neat feature is that <a href="http://pythonpaste.org/script/developer.html#templates">Paste Script Templates</a> can lay out a skeleton project for you, a la Eclipse Projects.  Templates are also activated with entry points, therefore framework developers can provide a <code>paste.paster_create_template</code> entry point in their eggs and <code>paster create</code> will create the boilerplate for you.</p>
<p>What I like especially about all this is that it&#8217;s just glue and isn&#8217;t dependent on any one server or framework.  This is, after all, the point of WSGI.  You can deploy your app on Paste&#8217;s wsgi server or use flup&#8217;s FastCGI implementation.  Frameworks such as TurboGears and Django are becoming compatible with <code>paster create</code> (by the <a href="http://svn.w4py.org/Paste/CherryPaste/trunk/docs/index.txt">looks of things</a> there are some issues relating to thread control still to be resolved).</p>
<p>Also the idea of deploying web apps as eggs is very exciting.  It evokes a vision of deploying all BADC web services through a common configuration framework with authentication, etc. handled transparently.  Thus the app developer needn&#8217;t write a single line of authentication code.  Paste definitely seems to bring python closer to that vision.</p>
<p><strong>Stop Press:</strong> And I now discover the SVN version of Paste has a <a href="http://svn.pythonpaste.org/Paste/trunk/paste/auth/open_id.py">wsgi filter</a> for <a href="http://www.openid.net">OpenID</a>!
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=16</wfw:commentRSS>
		</item>
		<item>
		<title>FastCGI Resurgence</title>
		<link>http://www.lirico.co.uk/wp/?p=17</link>
		<comments>http://www.lirico.co.uk/wp/?p=17#comments</comments>
		<pubDate>Sun, 29 Oct 2006 14:45:28 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>FastCGI</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=17</guid>
		<description><![CDATA[I must confess to being rather puzzled that FastCGI is getting so much attention lately.  I remember coming across it in the &#8217;90s during my PhD &#8212; the inefficiency of CGI was an issue then as it is now &#8212; but then trends appeared to be leaving FastCGI behind.  Either you integrated the [...]]]></description>
			<content:encoded><![CDATA[<p>I must confess to being rather puzzled that <a href="http://www.fastcgi.com">FastCGI</a> is getting so much attention lately.  I remember coming across it in the &#8217;90s during my PhD &#8212; the inefficiency of CGI was an issue then as it is now &#8212; but then trends appeared to be leaving FastCGI behind.  Either you integrated the application framework with apache using <code>mod_php/perl/python</code> or you used a specialised application server such as <a href="http://www.zope.org">Zope</a>, putting it behind an apache proxy if necessary.  I chose the former when developing the <a href="http://mcm.leeds.ac.uk/MCM">MCM website</a> and for a time I did battle with Zope&#8217;s many complex features.  What has changed now to bring FastCGI to the fore?</p>
<p>I should have had an inkling of the reason from an experience I had a year or so ago whilst developing an application for the <a href="http://www.climateprediction.net">Climateprediction.net</a> project.  I chose to use mod_python but their site used mod_php.  It transpired that when using the distro.&#8217;s packages these two modules didn&#8217;t sit well together.  They both tried to link to different versions of libxml and ugly core dumps followed.  I fixed it by recompiling mod_php but it was hardly a solution one would want to repeat.</p>
<p>Bryan Lawrence has brought together some insightful wisdom on the matter <a href="http://home.badc.rl.ac.uk/lawrence/blog/2006/07/05/whither_our_web_servers_-_part_i_">here</a> and <a href="http://home.badc.rl.ac.uk/lawrence/blog/2006/07/10/whither_our_web_servers_-_part_ii">here</a>. I&#8217;m embarrassed to admit it&#8217;s taken me so long to read them.  Also the <a href="http://en.wikipedia.org/wiki/FastCGI">Wikipedia entry</a> for FastCGI explains its resurgence.  In a world of a million and one web frameworks it isn&#8217;t feasible to re-implement a webserver for each one, let alone squeeze them all into one server process with <code>mod_*</code>.</p>
<p>It all helps to reassure me that my instincts were right when I began to go cold on Zope.  It is a great platform to work with once you know how but its great weakness is that it reinvents everything at once: object publisher, templates, authentication, permissions, the lot.  It makes for an enormous learning curve and then people build more complexity on top with <a href="http://www.plone.org">Plone</a> and the like. It&#8217;s just not very agile.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=17</wfw:commentRSS>
		</item>
		<item>
		<title>An exploration of OS cheminformatics tools</title>
		<link>http://www.lirico.co.uk/wp/?p=14</link>
		<comments>http://www.lirico.co.uk/wp/?p=14#comments</comments>
		<pubDate>Thu, 26 Oct 2006 21:46:13 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>Cheminformatics</category>
	<category>MCM</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=14</guid>
		<description><![CDATA[In November I&#8217;m starting a new project in collaboration with Cambridge and Leeds Universities to link the information of the MCM website and the IUPAC Chemical Kinetics database.  To achieve our aims we are going to need to improve the cheminformatics tools underlying these sites, so I have been reviewing what options we have [...]]]></description>
			<content:encoded><![CDATA[<p>In November I&#8217;m starting a new project in collaboration with Cambridge and Leeds Universities to link the information of the <a href="http://mcm.leeds.ac.uk/MCM">MCM</a> website and the <a href="http://www.iupac-kinetic.ch.cam.ac.uk/">IUPAC Chemical Kinetics database</a>.  To achieve our aims we are going to need to improve the cheminformatics tools underlying these sites, so I have been reviewing what options we have for using OpenSource products.</p>
<p>Casting about for promising OS projects I was led to the <a href="http://www.blueobelisk.org">Blue Obelisk</a> website, a gathering of chemical informaticians working on various projects, two of which stand out from the crowd: <a href="http://openbabel.sf.net/">OpenBabel</a> and the <a href="http://cdk.sf.net/">Chemistry Development Kit</a> (CDK).</p>
<p>These two projects clearly have a thriving developer community and a lot going for them.  In my experimenting I had cause to contact developers from both projects and was delighted to get quick constructive feedback (more on this later).</p>
<p>OpenBabel is written in C++ with <a href="http://www.swig.org/">SWIG</a>-based bindings to Python, Perl and Ruby.  As its name suggests, it concentrates on translating between chemical file formats; however, the API provides a good general cheminformatics toolbox.  CDK, by contrast, is written in Java and aims to provide a toolkit to underpin interactive tools such as <a href="http://jchempaint.sf.net/">JChemPaint</a> and <a href="http://jmol.sf.net/">Jmol</a>.  The range of features covered by CDK is much broader than OpenBabel&#8217;s, but quite a lot of them appear to be work in progress.</p>
<p>The common field between the MCM and IUPAC is atmospheric chemical kinetics and therefore central to our requirements is the representation of radical species.  Perhaps not surprisingly in a field dominated by pharmaceuticals and biochemistry, cheminformatics tools have tended to do a poor job of representing radicals.  As a starting point I wanted to see how OpenBabel and CDK handled radical input, particularly in SMILES and MDL Molfile formats since the MCM already uses these formats.</p>
<p>SMILES is a particular problem here because it has never officially supported radicals.  The MCM has used the <a href="http://www.accelrys.com/products/accord/index.html">Accord</a> Excel &#038; Access plugins which use an extension where &#8220;[C.]&#8221; signifies a carbon radical centre.  So how did the two toolkits match up?</p>
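<p>As an aside, the Accord-style notation is easy to spot mechanically.  The following is a minimal Python sketch and not part of either toolkit: the function name <code>radical_centres</code> and the regular expression are illustrative assumptions, and it only recognises the simplest bracket form (an element symbol followed by a dot).</p>

```python
import re

# Assumed pattern: a bracket atom holding only an element symbol and a
# trailing dot, e.g. "[C.]" or "[O.]".
ACCORD_RADICAL = re.compile(r"\[([A-Z][a-z]?)\.\]")


def radical_centres(smiles):
    """Return the element symbols of Accord-style radical atoms, in order."""
    return ACCORD_RADICAL.findall(smiles)


print(radical_centres("CC[C.]"))
print(radical_centres("CC[CH2]"))
```

<p>Hydrogen-deficient forms such as <code>[CH2]</code> are deliberately not matched here, since deciding whether they are radicals needs a valence model rather than a pattern match.</p>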
<h2>OpenBabel</h2>
<p>Browsing the OpenBabel wiki turned up <a href="http://openbabel.sourceforge.net/wiki/Radicals_and_SMILES_extensions">this page</a> showing that the OpenBabel developers are very aware of the problem.  Although OpenBabel supports &#8220;[C.]&#8221; etc. as input, it always generates SMILES radicals using explicit hydrogens.  For example:</p>
<blockquote>
<pre>$ echo "CC[C.]" | babel -ismi -osmi
CC[CH2]
$ echo "CC[CH]C" | babel -ismi -osmi
CC[CH]C</pre>
</blockquote>
<p>This form is unambiguous and was also recognised by a couple of non-opensource toolkits with academic licences (<a href="http://www.chemaxon.com/demos/try_marvin.html">Marvin</a> and <a href="http://www.xemistry.com/academic">CACTVS</a>).  OpenBabel handled the &#8220;RAD&#8221; property of MDL molfiles successfully, so I thought it had everything we needed.  There were a couple of hitches yet to uncover, but before I go into them let&#8217;s turn to CDK.</p>
<h2>CDK</h2>
<p>I&#8217;m not a Java developer but with the language&#8217;s dominance in many areas I&#8217;m always happy to find an opportunity to learn it properly.  For my test I used <a href="http://www.lirico.co.uk/wp/jpype.sourceforge.net">JPype</a> as a familiar Python shell to play with the CDK API.  Even though I&#8217;d already taken a liking to OpenBabel I wanted to give CDK a good test because CDK has one feature that OpenBabel doesn&#8217;t: molecule depiction, that is, generating 2D coordinates and a resulting image for chemical structures.</p>
<p>Unfortunately CDK&#8217;s radical support is less complete.  It rejected the &#8220;[C.]&#8221; SMILES extension and added hydrogens rather than radicals when fed hydrogen-deficient SMILES atoms.  It did recognise radicals in MDL molfiles and in the CML <code>spinMultiplicity</code> attribute, producing a reasonably rendered image of a cyclic radical.</p>
<p>Since SMILES radical support is so important to us I submitted a bug report on sourceforge and got a quick response pointing out that SMILES doesn&#8217;t support radicals at all.  This exposes SMILES&#8217; great weakness.  SMILES is a proprietary format of the <a href="http://www.daylight.com">Daylight Corporation</a> and to my knowledge has never had a published standard; as a result, different implementers have extended SMILES to support radicals in different ways.  You can&#8217;t blame CDK for concentrating on compliance with the Daylight toolkit, but I hope opensource tools can converge on this issue.  To my mind OpenBabel&#8217;s approach of using hydrogen deficiency is the way forward.</p>
<p>It looks like we would be able to use CDK to depict species by passing molfiles or CML to it.  In this way we could use OpenBabel and CDK together to do what we want.</p>
<h2>Returning to OpenBabel</h2>
<p>I was having a lot of success using the Python bindings to OpenBabel until I tried a peroxy radical.  It&#8217;s easiest to illustrate the problem on the command-line:</p>
<blockquote>
<pre>$ echo "CCO[O]" | babel -ismi -osmi
CCOO</pre>
</blockquote>
<p>Here OpenBabel doesn&#8217;t appear to convert the hydrogen-deficient oxygen atom into a radical.  This problem is exposed further by feeding babel output back to babel:</p>
<blockquote>
<pre>$ echo "CCO[O.]" | babel -ismi -osmi | babel -ismi -osmi
CCOO</pre>
</blockquote>
<p>Disaster! babel forgets the radical centre on successive conversions. However, there is a happy ending.  I have learned never to criticise an opensource project until you have tested the bleeding edge version, so I checked out the development release from the sourceforge SVN. With relief I discovered the bug had been fixed:</p>
<blockquote>
<pre>$ echo "CCO[O.]" | babel -ismi -osmi | babel -ismi -osmi
CCO[O]</pre>
</blockquote>
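<p>A check like this is easy to automate.  The sketch below is one possible way to do it in Python and is not taken from OpenBabel: <code>babel_convert</code> assumes a <code>babel</code> executable on the PATH and reuses the flags from the shell examples above, while <code>round_trip_is_stable</code> and the two lookup tables are hypothetical names introduced for illustration.  The tables stand in for the two babel versions so the check runs without babel installed; the intermediate value for the older release is an assumption, since only its final output is shown above.</p>

```python
import subprocess


def babel_convert(smiles):
    """Run one SMILES-to-SMILES pass through the babel command line.

    Assumes a `babel` executable is on the PATH; the flags are the ones
    used in the shell examples above.
    """
    result = subprocess.run(
        ["babel", "-ismi", "-osmi"],
        input=smiles + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def round_trip_is_stable(convert, smiles, passes=2):
    """Return True if every later pass reproduces the first pass's output."""
    first = convert(smiles)
    current = first
    for _ in range(passes - 1):
        current = convert(current)
        if current != first:
            return False
    return True


# Lookup tables standing in for babel so the check runs without it.
# The intermediate "CCO[O]" for the old release is an assumption.
old_release = {"CCO[O.]": "CCO[O]", "CCO[O]": "CCOO", "CCOO": "CCOO"}.get
svn_release = {"CCO[O.]": "CCO[O]", "CCO[O]": "CCO[O]"}.get

print(round_trip_is_stable(old_release, "CCO[O.]"))
print(round_trip_is_stable(svn_release, "CCO[O.]"))
```

<p>Passing <code>babel_convert</code> as the first argument would run the same check against whichever babel is installed.</p>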
<h2>The sting in the tail</h2>
<p>OpenBabel was to deal me one last surprise.  The Python bindings on the SVN version failed to compile.  I sent a message to openbabel-scripting and got a response within minutes (some of the developers must be in Europe).  This is a very good sign for anyone thinking of starting critical development work with an opensource codebase.  A small patch and everything was fine.  By the time you read this it may well be in SVN.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=14</wfw:commentRSS>
		</item>
		<item>
		<title>Expert systems in Python</title>
		<link>http://www.lirico.co.uk/wp/?p=13</link>
		<comments>http://www.lirico.co.uk/wp/?p=13#comments</comments>
		<pubDate>Fri, 20 Oct 2006 15:36:49 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>Python</category>
	<category>Cheminformatics</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=13</guid>
		<description><![CDATA[Many years ago now my PhD was all about building an expert system to deduce chemical mechanisms for reactions in the troposphere.  The system, uninspiringly called MechGen, worked up to a point but was always pretty hairy underneath.  Ever since that time I have always kept my
eye open for software components that might [...]]]></description>
			<content:encoded><![CDATA[<p>Many years ago now my PhD was all about building an expert system to deduce chemical mechanisms for reactions in the troposphere.  The system, uninspiringly called MechGen, worked up to a point but was always pretty hairy underneath.  Ever since that time I have always kept my<br />
eye open for software components that might work a bit better.</p>
<p>One part of the jigsaw might be <a href="http://openbabel.sourceforge.net/">openbabel</a>, a cheminformatics toolkit with a Python wrapper, but I have always drawn a blank in finding expert system tools that integrate with Python &#8212; until now.</p>
<p><a href="http://pyclips.sourceforge.net/">PyCLIPS</a> is an interface between Python and the venerable CLIPS expert system shell.  For a relatively young opensource project it looks very impressive.  The documentation is very detailed and it&#8217;s distributed as an Egg, so I was able to test it out very quickly.</p>
<p>I can see an opportunity to rekindle my interest in expert systems.  Maybe one day there will even be a MechGen2?
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=13</wfw:commentRSS>
		</item>
		<item>
		<title>Matplotlib with Eggs</title>
		<link>http://www.lirico.co.uk/wp/?p=11</link>
		<comments>http://www.lirico.co.uk/wp/?p=11#comments</comments>
		<pubDate>Fri, 20 Oct 2006 15:19:09 +0000</pubDate>
		<dc:creator>stephen.pascoe</dc:creator>
		
	<category>NDG</category>
	<category>Python</category>
	<category>Eggs</category>
		<guid isPermaLink="false">http://www.lirico.co.uk/wp/?p=11</guid>
		<description><![CDATA[I am really beginning to get excited about Python Eggs, from both a developer&#8217;s and installer&#8217;s perspective.  I&#8217;ve been investigating packaging various parts of NDG as eggs and writing a fair amount of setuptools code recently.
Whilst I was buried in some of this code it occurred to me that eggs would allow me to [...]]]></description>
			<content:encoded><![CDATA[<p>I am really beginning to get excited about <a href="http://peak.telecommunity.com/DevCenter/PythonEggs">Python Eggs</a>, from both a developer&#8217;s and installer&#8217;s perspective.  I&#8217;ve been investigating packaging various parts of <a href="http://proj.badc.rl.ac.uk/ndg">NDG</a> as eggs and writing a fair amount of <a href="http://peak.telecommunity.com/DevCenter/setuptools">setuptools</a> code recently.<br />
Whilst I was buried in some of this code it occurred to me that eggs would allow me to experiment with various Python packages I never get around to trying.  A prime example is scientific packages.  We are using Numeric and cdat in NDG but there are several alternatives out there for data access, analysis and visualisation.</p>
<p>As a test I decided to see how quick and easy it would be to install these components using easy_install:</p>
<ol>
<li>numpy (note: NOT Numeric)</li>
<li>matplotlib.  For visualisation</li>
<li>basemap.  A toolkit package for matplotlib which adds map projection support.</li>
<li>pycdf.  A libnetcdf.a wrapper.</li>
</ol>
<p>None of these projects distribute eggs and they all require C extension compilation so I wasn&#8217;t expecting it to work out of the box.  Although I was right about that, easy_install got surprisingly far.</p>
<p>My test system was a linux-i686 Suse10.0 machine.  Using the standard Python I checked distutils was installed, downloaded ez_setup.py and created a local setuptools installation directory on my PYTHONPATH.</p>
<p><code>$ easy_install numpy</code> just worked, as did <code>$ easy_install matplotlib</code>.  The link from the Python cheeseshop to pycdf was broken so I had to download the tarball by hand.  Once this was done <code>$ easy_install -f . pycdf</code> also worked thanks to libnetcdf.a being in a standard place.  This was truly better than I had expected.</p>
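<p>A successful build does not guarantee a working import, so it is worth checking each package afterwards.  The following is a rough Python sketch of such a check; the helper name <code>import_report</code> is an assumption made for illustration, and it uses the modern <code>importlib</code> module rather than anything available to the Python of the time.</p>

```python
import importlib


def import_report(names):
    """Map each module name to None on success or to the error text."""
    report = {}
    for name in names:
        try:
            importlib.import_module(name)
            report[name] = None
        except Exception as exc:  # a broken extension may not raise ImportError
            report[name] = str(exc)
    return report


for name, error in import_report(["numpy", "matplotlib", "pycdf"]).items():
    print(name, "ok" if error is None else "FAILED: " + error)
```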
<p>Unfortunately basemap wasn&#8217;t so helpful.  Compilation went fine but I couldn&#8217;t import the module.  Since this package contains data files I wasn&#8217;t surprised that it needed installing with <code>$ easy_install --always-unzip</code> and the BASEMAP_DATA_PATH environment variable needed setting but there was something else wrong.</p>
<p>Here my immersion in the setuptools system came to my rescue.  The basemap package is designed to sit inside matplotlib&#8217;s hierarchy.  When two components sitting in different eggs want to appear as a merged package tree, they need to use a feature of setuptools called namespace_packages.</p>
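<p>The idea of a merged package tree can be illustrated without setuptools.  The sketch below swaps in the standard library&#8217;s <code>pkgutil.extend_path</code>, a close relative of the <code>pkg_resources.declare_namespace</code> call that setuptools namespace packages rely on, and builds two separate directories that both contribute to one package.  The package name <code>demo_ns</code>, the module names and the helper functions are all hypothetical; this is not basemap&#8217;s or matplotlib&#8217;s actual layout.</p>

```python
import importlib
import os
import sys
import tempfile

# Every portion of the shared package carries this same __init__.py.
# pkgutil.extend_path is the standard-library relative of the
# pkg_resources.declare_namespace call used by setuptools eggs.
NAMESPACE_INIT = (
    "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"
)


def make_portion(root, module_name, source):
    """Create root/demo_ns/ with the namespace __init__.py and one module."""
    package_dir = os.path.join(root, "demo_ns")
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write(NAMESPACE_INIT)
    with open(os.path.join(package_dir, module_name + ".py"), "w") as handle:
        handle.write(source)


def build_and_import():
    """Install two portions in separate directories and import from both."""
    first_root = tempfile.mkdtemp()   # stands in for one egg
    second_root = tempfile.mkdtemp()  # stands in for the other egg
    make_portion(first_root, "core", "NAME = 'core'\n")
    make_portion(second_root, "plugin", "NAME = 'plugin'\n")
    sys.path[:0] = [first_root, second_root]
    importlib.invalidate_caches()
    core = importlib.import_module("demo_ns.core")
    plugin = importlib.import_module("demo_ns.plugin")
    return core.NAME, plugin.NAME


print(build_and_import())
```

<p>Both directories carry an identical <code>__init__.py</code> for the shared package, which is the kind of layout detail that has to be exactly right for the merge to work.</p>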
<p>I dived into the basemap code and discovered it had been written with optional support for setuptools.  However, getting namespace_packages working requires a very precise code layout and basemap had got it wrong.  Once the problem was identified fixing it was straightforward:</p>
<blockquote>
<pre>$ pushd /matplotlib
$ cp toolkits/__init__.py .
$ popd</pre>
</blockquote>
<p>Now everything worked and I was able to create one of the standard basemap examples.</p>
<p><img id="image12" alt="matplotlib/basemap example" style="width: 90%" src="http://www.lirico.co.uk/wp/wp-content/uploads/2006/10/image.png" /></p>
<p>It&#8217;s a pity basemap falls down on this small point and I hope it is fixed in the future.  I know installing matplotlib can be a bit daunting and using easy_install definitely improved matters.  It makes the prospect of deploying this software as part of an application server much more appealing.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://www.lirico.co.uk/wp/?feed=rss2&amp;p=11</wfw:commentRSS>
		</item>
	</channel>
</rss>
