Over the past few weeks, I’ve been following a great blog by Carl Vogel. It has an excellent and growing collection of Python examples ported from R. In general, it is useful for those “interested in the Python data analysis toolkit and its viability as an alternative to R”. Carl draws on examples from Machine Learning for Hackers by Drew Conway and John Myles White, as well as Gelman and Hill’s Data Analysis Using Regression and Multilevel/Hierarchical Models.
- Tue 12 February 2013 cfarmer
- Sat 06 October 2012 cfarmer
Unfortunately, I haven’t had much time recently to update or work on manageR, but I’m hoping that will change in the next few months… Having said that, quite a few people out there have been having trouble installing manageR (and the required rpy2) on their systems to get things working at all! Some individuals have provided possible fixes and suggestions on how to get things working properly on various platforms, and I’m going to use this post to amalgamate them, and hopefully create a one-stop post for all your manageR needs. I’m also hoping that people will post potential fixes in the comments to help others with more specific problems.
- Fri 30 March 2012 cfarmer
It’s been quite a while since my last post, and it’s Friday and I was feeling creative, so I decided to map something! I’ve been looking for an excuse to produce a nice graphic like the one Anita Graser created to represent Vienna’s green spaces. She used Quantum GIS to produce a hexagonal grid for representing the density of Viennese trees instead of the standard heat map or kernel density map, and the results are quite nice! I’m a huge fan of QGIS, but I tend to do most of my work in R, so I decided to ...
- Wed 09 November 2011 cfarmer
Well, it’s been a long time since my last post, but I do have a relatively good reason: I was finishing up my PhD thesis. The good news is that I’m now done and graduated! I’m hoping I’ll have a bit more time to blog and continue working on side projects that I had to put on hold while finishing up. My plan for the next few months is to finish up here in Maynooth, (unofficially) start some post-doc work, and finish/get going on several papers on my PhD research. I’m also going to try to learn Bayesian statistics, fiddle about with some visualizations I’ve been working on, and start getting back into QGIS and Python development again.
In the meantime, I’ve put together a fun little visualization of my PhD thesis in the form of a word cloud.
- Sat 06 November 2010 cfarmer
My two favorite scientific programming languages are Python and R, each for its own specific strengths. I stick with R for most of my serious stats stuff, but for everyday processing, analysis, and GUI building, Python is my go-to. Lately, however, I’ve been doing more and more things in Python… even the stats stuff. When doing statistical analysis in Python, I usually use the excellent rpy2 library to communicate between Python and R. As a result, I have put together quite a few little code snippets to work with R commands in Python. Recently, I decided to put ...
- Thu 23 September 2010 cfarmer
Data visualization is part of my everyday workflow. More often than not, I’m playing around with my data in a GIS to tease out interesting or informative spatial patterns, or to ensure that I’m getting the results that I’m expecting. As a result, I am constantly trying out different classification schemes to help me generalize spatial patterns, highlight outliers, or just plain mess around with my data.
- Wed 21 April 2010 cfarmer
In a recent post, I mentioned that I was testing the stability of clusters generated from a modified network partitioning algorithm using bootstrap resampling techniques. I also mentioned that I was doing this in R, using the very nice foreach package published by REvolution Computing. To show just how nice this package is, below is a minimal example of bootstrapping a network partitioning algorithm which takes advantage of a multicore processor:
- Sat 17 April 2010 cfarmer
My PhD research at the moment focuses on network-based algorithms for delineating functional regions (geographical regions within which a large majority of the local population seeks employment, and the majority of local employers recruit their labour). Currently I’m using a network partitioning algorithm based on modularity maximisation. I have found my results to be quite good so far, but ‘quite good’ isn’t really a very scientific description of validity, so obviously some other means of validation is required. Enter bootstrap resampling!
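As a rough illustration of the idea (not the code used for the research described above), here is a minimal Python sketch of bootstrap resampling applied to a network partition. Everything in it is an assumption made for illustration: the network is a plain list of unweighted edges, `partition` is a hypothetical stand-in that simply labels connected components rather than maximising modularity, and `bootstrap_stability` is a hypothetical helper that reports how often pairs of nodes grouped together in the original partition stay together when the edges are resampled with replacement.

```python
import random
from collections import defaultdict


def partition(edges):
    """Stand-in partitioner (assumption): label nodes by connected component.

    This is NOT modularity maximisation; swap in a real algorithm here.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return {node: find(node) for node in list(parent)}


def bootstrap_stability(edges, n_boot=200, seed=0):
    """Fraction of bootstrap replicates in which each reference pair stays together."""
    rng = random.Random(seed)
    reference = partition(edges)
    nodes = sorted(reference)
    # Pairs of nodes that share a cluster in the partition of the full data.
    pairs = [
        (a, b)
        for i, a in enumerate(nodes)
        for b in nodes[i + 1:]
        if reference[a] == reference[b]
    ]
    kept = defaultdict(int)
    for _ in range(n_boot):
        # Resample edges with replacement, keeping the sample size fixed.
        sample = rng.choices(edges, k=len(edges))
        labels = partition(sample)
        for a, b in pairs:
            if a in labels and b in labels and labels[a] == labels[b]:
                kept[(a, b)] += 1
    return {pair: kept[pair] / n_boot for pair in pairs}


# Toy network (made up): two triangles with no links between them.
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6)]
stability = bootstrap_stability(toy_edges, n_boot=100)
```

Pairs with a fraction close to 1 are robust to resampling, while low values flag groupings that depend on a handful of edges. With real commuting data, the edges would carry flow weights and the resampling scheme would need to respect them.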