solving DESeq2 installation issues

At work a colleague asked me to do a system-wide installation of the R package DESeq2 on one of our internal servers.
The installation procedure is quite straightforward:

source("http://bioconductor.org/biocLite.R")
biocLite("DESeq2")

Unfortunately I had some issues on my system. This is what I got:

…
Warning in fun(libname, pkgname) :
couldn't connect to display "localhost:12.0"
* DONE (maSigPro)

The downloaded source packages are in
    ‘/tmp/RtmpfdD2RC/downloaded_packages’
Warning messages:
1: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘XML’ had non-zero exit status
2: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘annotate’ had non-zero exit status
3: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘genefilter’ had non-zero exit status
4: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘geneplotter’ had non-zero exit status
5: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘DESeq2’ had non-zero exit status
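
The five warnings above are probably not independent: R installs dependencies before the packages that need them, so the first failure in the list (here XML) is typically the one worth investigating. As a minimal sketch, assuming the installation log is available on standard input, the failing package names can be pulled out like this. The helper name `failed_packages` is hypothetical and not part of R or Bioconductor.

```shell
#!/bin/sh
# Print the name of every package that R reported as failed, one per line.
# Reads an R installation log on standard input.
# (failed_packages is an illustrative helper, not an existing tool.)
failed_packages() {
    sed -n 's/.*installation of package ‘\([A-Za-z0-9.]*\)’ had non-zero exit status.*/\1/p'
}

# Example: the first two warnings from the log above, fed through the filter.
# Prints "XML" and "annotate", one per line.
failed_packages <<'EOF'
Warning messages:
1: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘XML’ had non-zero exit status
2: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘annotate’ had non-zero exit status
EOF
```

The first name printed is the one to try installing on its own, since its error output usually names the missing system library.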

I then tried to manually install the individual dependencies, such as XML, but still no luck. A quick Google search revealed that my Ubuntu machine was missing a couple of -dev packages, so I installed them:

root@server:~# apt-get install libcurl4-openssl-dev libxml2-dev

… and then re-tried to install DESeq2.
This time everything was ok! Problem solved!
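
To check up front whether the two -dev packages are already present on a Debian or Ubuntu machine, something along these lines should work. It is only a sketch: `missing_packages` is a hypothetical helper name, and it assumes `dpkg-query` is available, as it is on a stock Ubuntu system.

```shell
#!/bin/sh
# Print every package from the argument list that is NOT installed.
# (missing_packages is an illustrative helper, not an existing tool.)
missing_packages() {
    for pkg in "$@"; do
        # dpkg-query prints a status such as "install ok installed"
        # for an installed package; unknown packages yield no output.
        status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null) || status=""
        case "$status" in
            *"ok installed") ;;            # present, nothing to report
            *) printf '%s\n' "$pkg" ;;     # absent or only partially installed
        esac
    done
}

# The two packages that were missing in this case.
missing_packages libcurl4-openssl-dev libxml2-dev
```

Any name it prints can then be passed to apt-get install as shown above.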

The homogenization of scientific computing, or why Python is steadily eating other languages’ lunch

Speaking only for myself, I’ve now arrived at the point where around 90–95% of what I do can be done comfortably in Python. So the major consideration for me, when determining what language to use for a new project, has shifted from “what’s the best tool for the job that I’m willing to learn and/or tolerate using?” to “is there really no way to do this in Python?” By and large, this mentality is a good thing, though I won’t deny that it occasionally has its downsides. For example, back when I did most of my data analysis in R, I would frequently play around with random statistics packages just to see what they did. I don’t do that much any more, because the pain of having to refresh my R knowledge and deal with that thing again usually outweighs the perceived benefits of aimless statistical exploration.

Conversely, sometimes I end up using Python packages that I don’t like quite as much as comparable packages in other languages, simply for the sake of preserving language purity. For example, I prefer Rails’ ActiveRecord ORM to the much more explicit SQLAlchemy ORM for Python, but I don’t prefer it enough to justify mixing Ruby and Python objects in the same application. So, clearly, there are costs. But they’re pretty small costs, and for me personally, the scales have now clearly tipped in favor of using Python for almost everything. I know many other researchers who’ve had the same experience, and I don’t think it’s entirely unfair to suggest that, at this point, Python has become the de facto language of scientific computing in many domains. If you’re reading this and haven’t had much prior exposure to Python, now’s a great time to come on board!

Tal Yarkoni ☞ [citation needed]

installing RMySQL on CentOS

… a quick note for anyone who, like me, needs to install the RMySQL package on a CentOS (or RHEL) server.

Long-time readers may remember that on my servers R is installed from the EPEL repository (as I described in an earlier post on this blog).

With that said, on to the actual installation. Point your browser at the RMySQL site and download both packages, RMySQL_0.7-5.tar.gz and DBI_0.2-5.tar.gz. I had to go this way because the R-DBI package in the repositories enabled on my server caused unexpected dependency and version problems.

Then, since the package had to be available to every user on the system, I ran the following commands as root:

[root@machine ~]# R CMD INSTALL DBI_0.2-5.tar.gz
[root@machine ~]# R CMD INSTALL RMySQL_0.7-5.tar.gz

At that point all the user had to do was start his R development ‘environment’ and check that the new package was actually installed and available. Happy coding!
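
That final check can also be done from the shell, without opening an interactive R session. The following is a sketch under two assumptions: `Rscript` is on the user's PATH, and `r_package_available` is a hypothetical helper name chosen for this example.

```shell
#!/bin/sh
# Succeed (exit status 0) only if the named R package can be loaded.
# (r_package_available is an illustrative helper, not an existing tool.)
r_package_available() {
    Rscript \
        -e 'pkg <- commandArgs(trailingOnly = TRUE)[1]' \
        -e 'ok <- suppressWarnings(require(pkg, character.only = TRUE, quietly = TRUE))' \
        -e 'quit(save = "no", status = if (ok) 0L else 1L)' \
        "$1" >/dev/null 2>&1
}

# Check the two packages installed above.
for pkg in DBI RMySQL; do
    if r_package_available "$pkg"; then
        echo "$pkg: available"
    else
        echo "$pkg: NOT available"
    fi
done
```

Running it as an ordinary user rather than as root confirms that the system-wide installation is visible to everyone.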

how to install R on CentOS

Not a very Christmassy topic today: a short how-to for installing R, the open source statistical computing suite, together with its development tools.

The CentOS distribution does not ship any build of this project in its official repositories. Fortunately, the project itself provides packages for the most popular GNU/Linux platforms (as well as Windows and Mac OS X) on its mirrors.

The binary RPMs are there, and since yum metadata is also available (the repodata directory), you can create a .repo file for leaner use that integrates with YUM, the system package manager.

This is the file I created:

[R - Project for Statistical Computing]
name=R repository
baseurl=http://rm.mirror.garr.it/mirrors/CRAN/bin/linux/redhat/el5/i386
failovermethod=priority
enabled=1
gpgcheck=0
priority=15

A note on gpgcheck=0: I set it because the packages are not signed (or at least I could not find the signature). priority=15 is there because I use the yum-priorities tool.
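
For completeness, here is one way the file could be put in place from the shell. Treat it as a sketch: `write_r_repo` is a hypothetical helper, and the one-word section id `[r-project]` is my substitution, made on the assumption that yum expects a repository id without spaces. The remaining settings are copied from the file above. On a real system the target would be a file under /etc/yum.repos.d/ written as root; the demonstration below uses a scratch file instead.

```shell
#!/bin/sh
# Write the repository definition to the path given as first argument.
# (write_r_repo is an illustrative helper; the section id is an assumption.)
write_r_repo() {
    target=$1
    cat > "$target" <<'EOF'
[r-project]
name=R repository
baseurl=http://rm.mirror.garr.it/mirrors/CRAN/bin/linux/redhat/el5/i386
failovermethod=priority
enabled=1
gpgcheck=0
priority=15
EOF
}

# Demonstration against a scratch file, so nothing under /etc is touched.
demo_repo=$(mktemp)
write_r_repo "$demo_repo"
cat "$demo_repo"
rm -f "$demo_repo"
```

Once the file is in place under /etc/yum.repos.d/, yum should pick the repository up on its next run.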

If you have suggestions or improvements, the comments on this post are there for that!