Events/GCC2012/TrainingDay/WS5

Workshop 5: Installing Your Own Galaxy

Documentation for all of these features is at Admin/Config/Performance/ProductionServer.

Create a new user for Galaxy

    trainingday@trainingday:~$ sudo -i
    [sudo] password for trainingday: 12345
    root@trainingday:~# useradd -m -s /bin/bash galaxy
    root@trainingday:~#

Install Mercurial

    root@trainingday:~# apt-get install mercurial
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following extra packages will be installed:
      mercurial-common
    Suggested packages:
      qct wish kdiff3 tkdiff meld xxdiff python-mysqldb python-pygments
    The following NEW packages will be installed:
      mercurial mercurial-common
    0 upgraded, 2 newly installed, 0 to remove and 24 not upgraded.
    Need to get 0 B/1,982 kB of archives.
    After this operation, 6,691 kB of additional disk space will be used.
    Do you want to continue [Y/n]?
    Selecting previously unselected package mercurial-common.
    (Reading database ... 60146 files and directories currently installed.)
    Unpacking mercurial-common (from .../mercurial-common_2.0.2-1ubuntu1_all.deb) ...
    Selecting previously unselected package mercurial.
    Unpacking mercurial (from .../mercurial_2.0.2-1ubuntu1_i386.deb) ...
    Processing triggers for man-db ...
    Setting up mercurial-common (2.0.2-1ubuntu1) ...
    Setting up mercurial (2.0.2-1ubuntu1) ...

    Creating config file /etc/mercurial/hgrc.d/hgext.rc with new version
    root@trainingday:~#

Clone the Galaxy Distribution

    root@trainingday:~# su - galaxy
    galaxy@trainingday:~$ hg clone https://bitbucket.org/galaxy/galaxy-dist/
    destination directory: galaxy-dist
    requesting all changes
    adding changesets
    adding manifests
    adding file changes
    added 7405 changesets with 28970 changes to 5975 files
    updating to branch default
    3922 files updated, 0 files merged, 0 files removed, 0 files unresolved
    galaxy@trainingday:~$ cd galaxy-dist
    galaxy@trainingday:~/galaxy-dist$

If you were following along in the workshop, we cloned from the local copy at ~trainingday/galaxy-dist rather than https://bitbucket.org/galaxy/galaxy-dist/ for performance reasons.  To change the location that updates are pulled from, edit the default URL in ~galaxy/galaxy-dist/.hg/hgrc.
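
As a rough sketch of that change, the function below rewrites the default entry in an hgrc file.  It assumes GNU sed (as shipped with Ubuntu) and that the file already contains a "default = ..." line, which hg clone normally writes.  The function name set_default_pull_url is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: point the "default" path in an hgrc at a new URL.
# Assumes the file already has a "default = ..." line under [paths].
set_default_pull_url() {
    hgrc="$1"
    url="$2"
    # Use | as the sed delimiter because URLs contain slashes.
    sed -i "s|^default[[:space:]]*=.*|default = $url|" "$hgrc"
}

# Example, run as the galaxy user:
# set_default_pull_url ~/galaxy-dist/.hg/hgrc https://bitbucket.org/galaxy/galaxy-dist/
```

Running hg paths in the repository afterwards shows which URL Mercurial will pull from.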

Configure Galaxy

    galaxy@trainingday:~/galaxy-dist$ cp universe_wsgi.ini.sample universe_wsgi.ini
    galaxy@trainingday:~/galaxy-dist$ vim universe_wsgi.ini

I changed the following settings:

  • database_connection = postgres:///galaxy?host=/var/run/postgresql - Use a PostgreSQL database via a local UNIX domain socket (the socket is in /var/run/postgresql).  Details on this URL syntax are at Admin/Config/Performance/ProductionServer under the "Switching to a database server" section.

  • database_engine_option_server_side_cursors = True - Keep large SQL query results on the PostgreSQL server, rather than transferring the entire result set to the Galaxy process.

  • database_engine_option_strategy = threadlocal - Only use one database connection per thread.

  • tool_dependency_dir = /home/galaxy/tool-deps - The directory that will house tool dependencies.  Admin/Config/Tool Dependencies explains how these dependencies can be configured.  Tools installed from the tool shed that manage their own dependencies (e.g. freebayes) will also use this directory.

  • debug = False - Disables debugging middleware that loads server responses into memory (can crash the server when handling large files).

  • use_interactive = False - Disables live client browser debugging (insecure).

  • library_import_dir = /home/galaxy/import - Administrators can directly import datasets from this directory on the server to Data Libraries.  This includes an option that allows an effective "symlink" to the data, rather than copying it into Galaxy's file_path directory.  Documented at Admin/DataLibraries/UploadingLibraryFiles.

  • user_library_import_dir = /home/galaxy/user-import - Non-administrators can directly import datasets from this directory on the server to Data Libraries to which they have been given write permission.  Documented at the same link as above.

  • allow_library_path_paste = True - Administrators can import datasets from anywhere on the server's filesystem(s) by entering their paths into a text area.

  • id_secret = <random text> - Ensures that the encoded IDs used by Galaxy (especially session IDs) are unique.  One simple way to generate a value for this is with a shell command like % date | md5sum

  • use_remote_user and remote_user_maildomain - I did not enable these, but this is how users can use your institution's existing authentication system to log in to Galaxy.  The setup depends on your proxy server; see Admin/Config/Apache Proxy or Admin/Config/Performance/nginx Proxy.

  • admin_users = nate@example.org - Make nate@example.org an administrator.  Galaxy's Admin UI is only accessible if you define administrators here!

  • allow_user_impersonation = True - Users configured as administrators (with admin_users) can "become" other users to view Galaxy exactly as the impersonated user does.  Useful for providing support.

  • allow_user_dataset_purge = True - Allow users to forcibly remove their datasets from disk (note that the data is only actually removed if all versions of a shared dataset are purged by all users who are sharing the dataset).  By default, Galaxy does not remove data, as this is done at a later time by the dataset cleanup scripts (discussed below).

  • enable_quotas = True - Enable Galaxy's quota system.  Quotas are configured by administrators in the Galaxy Admin UI.

  • set_metadata_externally = True - Galaxy must detect certain attributes about the outputs of a tool after the tool has finished running, and store these attributes as metadata.  These include things like the number of reads (for fasta/fastq), column types (for tabular data) and so forth.  This process can be very CPU intensive for large files and will lock up the Galaxy server process.  set_metadata_externally causes this step to happen in a separate process (and if the tool ran on a cluster, it happens on the cluster).
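
For id_secret, the output of date | md5sum is easy to guess if the time of installation is roughly known.  A sketch of an alternative that draws from the kernel's random source instead is shown here.  It assumes /dev/urandom is available (true on Linux), and make_id_secret is a name invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print 64 random hex characters for use as id_secret.
make_id_secret() {
    # Read 32 random bytes, dump them as hex, and strip od's spaces and newlines.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
}

# Example:
# make_id_secret
```

Paste the printed value after id_secret = in universe_wsgi.ini.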

Install PostgreSQL

    galaxy@trainingday:~/galaxy-dist$ exit
    logout
    root@trainingday:~# apt-get install postgresql
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following extra packages will be installed:
      libpq5 postgresql-9.1 postgresql-client-9.1 postgresql-client-common
      postgresql-common
    Suggested packages:
      oidentd ident-server locales-all postgresql-doc-9.1
    The following NEW packages will be installed:
      libpq5 postgresql postgresql-9.1 postgresql-client-9.1
      postgresql-client-common postgresql-common
    0 upgraded, 6 newly installed, 0 to remove and 24 not upgraded.
    Need to get 0 B/5,487 kB of archives.
    After this operation, 15.5 MB of additional disk space will be used.
    Do you want to continue [Y/n]?
    Preconfiguring packages ...
    Selecting previously unselected package libpq5.
    (Reading database ... 60757 files and directories currently installed.)
    Unpacking libpq5 (from .../libpq5_9.1.4-0ubuntu12.04_i386.deb) ...
    Selecting previously unselected package postgresql-client-common.
    Unpacking postgresql-client-common (from .../postgresql-client-common_129_all.deb) ...
    Selecting previously unselected package postgresql-client-9.1.
    Unpacking postgresql-client-9.1 (from .../postgresql-client-9.1_9.1.4-0ubuntu12.04_i386.deb) ...
    Selecting previously unselected package postgresql-common.
    Unpacking postgresql-common (from .../postgresql-common_129_all.deb) ...
    Adding 'diversion of /usr/bin/pg_config to /usr/bin/pg_config.libpq-dev by postgresql-common'
    Selecting previously unselected package postgresql-9.1.
    Unpacking postgresql-9.1 (from .../postgresql-9.1_9.1.4-0ubuntu12.04_i386.deb) ...
    Selecting previously unselected package postgresql.
    Unpacking postgresql (from .../postgresql_9.1+129_all.deb) ...
    Processing triggers for man-db ...
    Processing triggers for ureadahead ...
    ureadahead will be reprofiled on next reboot
    Setting up libpq5 (9.1.4-0ubuntu12.04) ...
    Setting up postgresql-client-common (129) ...
    Setting up postgresql-client-9.1 (9.1.4-0ubuntu12.04) ...
    update-alternatives: using /usr/share/postgresql/9.1/man/man1/psql.1.gz to provide /usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode.
    Setting up postgresql-common (129) ...
    Adding user postgres to group ssl-cert
    Building PostgreSQL dictionaries from installed myspell/hunspell packages...
      en_us
    Setting up postgresql-9.1 (9.1.4-0ubuntu12.04) ...
    Creating new cluster (configuration: /etc/postgresql/9.1/main, data: /var/lib/postgresql/9.1/main)...
    Moving configuration file /var/lib/postgresql/9.1/main/postgresql.conf to /etc/postgresql/9.1/main...
    Moving configuration file /var/lib/postgresql/9.1/main/pg_hba.conf to /etc/postgresql/9.1/main...
    Moving configuration file /var/lib/postgresql/9.1/main/pg_ident.conf to /etc/postgresql/9.1/main...
    Configuring postgresql.conf to use port 5432...
    update-alternatives: using /usr/share/postgresql/9.1/man/man1/postmaster.1.gz to provide /usr/share/man/man1/postmaster.1.gz (postmaster.1.gz) in auto mode.
     * Starting PostgreSQL 9.1 database server                               [ OK ]
    Setting up postgresql (9.1+129) ...
    Processing triggers for libc-bin ...
    ldconfig deferred processing now taking place
    root@trainingday:~#

Create PostgreSQL user and database

    root@trainingday:~# su - postgres
    postgres@trainingday:~$ createuser galaxy
    Shall the new role be a superuser? (y/n) n
    Shall the new role be allowed to create databases? (y/n) n
    Shall the new role be allowed to create more new roles? (y/n) n
    postgres@trainingday:~$ createdb -O galaxy galaxy
    postgres@trainingday:~$

Start Galaxy for the first time

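Start Galaxy once by hand with run.sh before setting it up as a service.  This is necessary because run.sh contains a number of setup steps that need to happen before Galaxy starts the first time.  Once "serving on" appears, stop the server with Ctrl-C.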

    postgres@trainingday:~$ exit
    logout
    root@trainingday:~# su - galaxy
    galaxy@trainingday:~$ cd galaxy-dist
    galaxy@trainingday:~/galaxy-dist$ sh run.sh --reload
      ... a lot of output ...
    serving on http://127.0.0.1:8080
    ^C^C caught in monitor process

    galaxy@trainingday:~/galaxy-dist$

Install an init script to start Galaxy automatically

    galaxy@trainingday:~/galaxy-dist$ exit
    logout
    root@trainingday:~# cd /etc/init.d
    root@trainingday:/etc/init.d# vim galaxy

In /etc/init.d/galaxy, paste the following:

    #!/bin/bash

    # Author: James Casbon, 2009

    ### BEGIN INIT INFO
    # Provides:             galaxy
    # Required-Start:       $network $local_fs
    # Required-Stop:
    # Should-Start:         postgresql
    # Default-Start:        2 3 4 5
    # Default-Stop:         0 1 6
    # Short-Description:    Galaxy
    ### END INIT INFO

    . /lib/lsb/init-functions

    USER="galaxy"
    GROUP="galaxy"
    DIR="/home/galaxy/galaxy-dist/"
    PYTHON="/usr/bin/python"
    OPTS="./scripts/paster.py serve universe_wsgi.ini"
    LOGDIR="/home/galaxy/galaxy-dist/log"
    RUNDIR="/var/run"

    case "${1:-''}" in
      'start')
            # Create the log directory on first use
            [ ! -d "$LOGDIR" ] && (mkdir -p $LOGDIR; chown $USER:$GROUP $LOGDIR)
            # Start one process for every [server:NAME] section in universe_wsgi.ini
            servers=`sed -n 's/^\[server:\(.*\)\]/\1/  p' $DIR/universe_wsgi.ini | xargs echo`
            for server in $servers; do
                PIDFILE="$RUNDIR/galaxy_$server.pid"
                SERVER_NAME="--server-name=$server"
                LOG_FILE="--log-file=$LOGDIR/$server.log"
                log_daemon_msg "Starting Galaxy $server"
                if start-stop-daemon --chuid $USER --group $GROUP --start --make-pidfile \
                          --pidfile $PIDFILE --background --chdir $DIR --exec $PYTHON -- $OPTS $SERVER_NAME $LOG_FILE; then
                  log_end_msg 0
                else
                  log_end_msg 1
                fi
            done
            ;;
      'stop')
            # Stop every server listed in universe_wsgi.ini, using its pidfile
            servers=`sed -n 's/^\[server:\(.*\)\]/\1/  p' $DIR/universe_wsgi.ini | xargs echo`
            for server in $servers; do
                PIDFILE="$RUNDIR/galaxy_$server.pid"
                log_daemon_msg "Stopping Galaxy $server"
                if start-stop-daemon --stop --pidfile $PIDFILE; then
                  log_end_msg 0
                else
                  log_end_msg 1
                fi
            done
            ;;
      'restart')
            $0 stop
            $0 start
            ;;
      *)    # no parameter, or an unsupported one
            echo "Usage: $0 start|stop|restart"
            exit 1
            ;;
    esac

Once saved, continue with:

    root@trainingday:/etc/init.d# chmod +x galaxy
    root@trainingday:/etc/init.d# update-rc.d galaxy defaults
     Adding system startup for /etc/init.d/galaxy ...
       /etc/rc0.d/K20galaxy -> ../init.d/galaxy
       /etc/rc1.d/K20galaxy -> ../init.d/galaxy
       /etc/rc6.d/K20galaxy -> ../init.d/galaxy
       /etc/rc2.d/S20galaxy -> ../init.d/galaxy
       /etc/rc3.d/S20galaxy -> ../init.d/galaxy
       /etc/rc4.d/S20galaxy -> ../init.d/galaxy
       /etc/rc5.d/S20galaxy -> ../init.d/galaxy
    root@trainingday:/etc/init.d#

Start Galaxy from the init script

    root@trainingday:/etc/init.d# /etc/init.d/galaxy start
     * Starting Galaxy main                                                  [ OK ]
    root@trainingday:/etc/init.d#

Galaxy can now be accessed at http://localhost:8080/

Install nginx

Note that under Debian/Ubuntu, the nginx-extras package contains third-party modules, including the nginx_upload_module, which we need for the advanced nginx config.  This module may also be available in nginx packages for Fedora-based distributions; if not, you may have to compile nginx from source to get the upload module.

    root@trainingday:/etc/init.d# apt-get install nginx-extras
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    The following extra packages will be installed:
      liblua5.1-0 libperl5.14 nginx-common
    The following NEW packages will be installed:
      liblua5.1-0 libperl5.14 nginx-common nginx-extras
    0 upgraded, 4 newly installed, 0 to remove and 24 not upgraded.
    Need to get 0 B/1,448 kB of archives.
    After this operation, 3,481 kB of additional disk space will be used.
    Do you want to continue [Y/n]?
    Selecting previously unselected package liblua5.1-0.
    (Reading database ... 61153 files and directories currently installed.)
    Unpacking liblua5.1-0 (from .../liblua5.1-0_5.1.4-12ubuntu1_i386.deb) ...
    Selecting previously unselected package libperl5.14.
    Unpacking libperl5.14 (from .../libperl5.14_5.14.2-6ubuntu2_i386.deb) ...
    Selecting previously unselected package nginx-common.
    Unpacking nginx-common (from .../nginx-common_1.1.19-1_all.deb) ...
    Selecting previously unselected package nginx-extras.
    Unpacking nginx-extras (from .../nginx-extras_1.1.19-1_i386.deb) ...
    Processing triggers for ufw ...
    Processing triggers for ureadahead ...
    Processing triggers for man-db ...
    Setting up liblua5.1-0 (5.1.4-12ubuntu1) ...
    Setting up libperl5.14 (5.14.2-6ubuntu2) ...
    Setting up nginx-common (1.1.19-1) ...
    Setting up nginx-extras (1.1.19-1) ...
    Processing triggers for libc-bin ...
    ldconfig deferred processing now taking place
    root@trainingday:/etc/init.d#

Configure and start nginx

The configuration of proxy servers is explained in the wiki at Admin/Config/Performance/nginx Proxy and Admin/Config/Apache Proxy.

    root@trainingday:/etc/init.d# cd /etc/nginx/sites-available/
    root@trainingday:/etc/nginx/sites-available# vim galaxy

In /etc/nginx/sites-available/galaxy, paste the following:

    # this file is included inside http {}

    # gzip is enabled in nginx.conf, but these override some of the other gzip defaults
    gzip_vary on;
    gzip_comp_level 4;
    gzip_proxied any;
    gzip_types text/plain text/css application/x-javascript text/xml application/xml text/javascript application/json;
    gzip_buffers 16 8k;

    # define the proxied application
    upstream galaxy_app {
            server localhost:8080;
            server localhost:8081;
    }

    # http server directives
    server {

            # maximum file upload size
            client_max_body_size 10G;

            # pass most requests to the proxied Galaxy application
            location / {
                    proxy_pass              http://galaxy_app;
                    proxy_set_header        X-Forwarded-Host $host;
                    proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
            }

            # directly handle file downloads in nginx
            location /_x_accel_redirect/ {
                    internal;
                    alias /;
            }

            # directly handle file uploads in nginx
            location /_upload {
                    upload_store /home/galaxy/nginx_upload_store;
                    upload_pass_form_field "";
                    upload_set_form_field "__${upload_field_name}__is_composite" "true";
                    upload_set_form_field "__${upload_field_name}__keys" "name path";
                    upload_set_form_field "${upload_field_name}_name" "$upload_file_name";
                    upload_set_form_field "${upload_field_name}_path" "$upload_tmp_path";
                    upload_pass_args on;
                    upload_pass /_upload_done;
            }
            location /_upload_done {
                    set $dst /tool_runner/index;
                    if ($args ~ nginx_redir=([^&]+)) {
                            set $dst $1;
                    }
                    rewrite "" $dst;
            }

            # directly serve static content in nginx
            location /static {
                    alias /home/galaxy/galaxy-dist/static;
                    expires 24h;
            }
            location /static/style {
                    alias /home/galaxy/galaxy-dist/static/june_2007_style/blue;
                    expires 24h;
            }
            location /static/scripts {
                    alias /home/galaxy/galaxy-dist/static/scripts/packed;
                    expires 24h;
            }
            location /favicon.ico {
                    alias /home/galaxy/galaxy-dist/static/favicon.ico;
                    expires 24h;
            }
            location /robots.txt {
                    alias /home/galaxy/galaxy-dist/static/robots.txt;
                    expires 24h;
            }
    }

Once saved, continue with:

    root@trainingday:/etc/nginx/sites-available# cd ../sites-enabled/
    root@trainingday:/etc/nginx/sites-enabled# rm default
    root@trainingday:/etc/nginx/sites-enabled# ln -s /etc/nginx/sites-available/galaxy
    root@trainingday:/etc/nginx/sites-enabled# cd ..
    root@trainingday:/etc/nginx# vim nginx.conf

In /etc/nginx/nginx.conf, change the first line:

    user www-data;

To:

    user galaxy;
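
The same one-line change can also be made without opening an editor.  This is only a sketch: it assumes GNU sed and that the user directive sits at the start of a line, and set_nginx_user is a name invented here.

```shell
#!/bin/sh
# Hypothetical helper: change the account that nginx worker processes run as.
set_nginx_user() {
    conf="$1"
    user="$2"
    # Rewrite only a "user ...;" directive that begins at the start of a line.
    sed -i "s/^user[[:space:]][^;]*;/user $user;/" "$conf"
}

# Example, run as root:
# set_nginx_user /etc/nginx/nginx.conf galaxy
```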

Once saved, continue with:

    root@trainingday:/etc/nginx# /etc/init.d/nginx start
    Starting nginx: nginx.
    root@trainingday:/etc/nginx#

Galaxy can now be accessed at http://localhost/

Configure Galaxy for nginx upload/download

    root@trainingday:/etc/nginx# su - galaxy
    galaxy@trainingday:~$ cd galaxy-dist/
    galaxy@trainingday:~/galaxy-dist$ vim universe_wsgi.ini

In universe_wsgi.ini, set the following:

  • nginx_x_accel_redirect_base = /_x_accel_redirect - This is the internal URL used by nginx to serve files for download.  It must match the location set in the nginx config above.

  • nginx_upload_store = /home/galaxy/nginx_upload_store - This is the directory that files uploaded to the nginx_upload_module will be saved to.  It must match the path set in the nginx config above.

  • nginx_upload_path = /_upload - This is the URL that Galaxy's upload form submits to, so that the nginx_upload_module (rather than Galaxy) receives the file.  It must match the upload location set in the nginx config above.
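
Because these values have to agree with the nginx config, a quick consistency check can catch typos before restarting.  The sketch below compares only the upload store path.  The helper names (ini_value, nginx_value, check_upload_store) are invented for this example, and the parsing is deliberately simple: it assumes one setting per line and no trailing comments.

```shell
#!/bin/sh
# Print the value of "key = value" from an ini-style file (first match only).
ini_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | head -n 1
}

# Print the argument of an nginx directive such as "upload_store /path;".
nginx_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]][[:space:]]*\([^;]*\);.*/\1/p" "$2" | head -n 1
}

# Succeed only if Galaxy and nginx agree on the upload store directory.
check_upload_store() {
    galaxy_dir=$(ini_value nginx_upload_store "$1")
    nginx_dir=$(nginx_value upload_store "$2")
    if [ -n "$galaxy_dir" ] && [ "$galaxy_dir" = "$nginx_dir" ]; then
        echo "upload store matches: $galaxy_dir"
    else
        echo "mismatch: Galaxy has '$galaxy_dir', nginx has '$nginx_dir'" >&2
        return 1
    fi
}

# Example:
# check_upload_store /home/galaxy/galaxy-dist/universe_wsgi.ini /etc/nginx/sites-available/galaxy
```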

Once saved, continue with:

    galaxy@trainingday:~/galaxy-dist$ exit
    logout
    root@trainingday:/etc/nginx# /etc/init.d/galaxy restart
     * Stopping Galaxy main                                                  [ OK ]
     * Starting Galaxy main                                                  [ OK ]
    root@trainingday:/etc/nginx#

Install ProFTPd

    root@trainingday:/etc/nginx# apt-get install proftpd proftpd-mod-pgsql
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    Note, selecting 'proftpd-basic' instead of 'proftpd'
    The following extra packages will be installed:
      libcap2
    Suggested packages:
      proftpd-doc proftpd-mod-mysql proftpd-mod-ldap proftpd-mod-odbc
      proftpd-mod-sqlite openbsd-inetd inet-superserver
    The following NEW packages will be installed:
      libcap2 proftpd-basic proftpd-mod-pgsql
    0 upgraded, 3 newly installed, 0 to remove and 24 not upgraded.
    Need to get 0 B/2,146 kB of archives.
    After this operation, 4,688 kB of additional disk space will be used.
    Do you want to continue [Y/n]?
    Preconfiguring packages ...
    Selecting previously unselected package libcap2.
    (Reading database ... 61206 files and directories currently installed.)
    Unpacking libcap2 (from .../libcap2_1%3a2.22-1ubuntu3_i386.deb) ...
    Selecting previously unselected package proftpd-basic.
    Unpacking proftpd-basic (from .../proftpd-basic_1.3.4a-1_i386.deb) ...
    Selecting previously unselected package proftpd-mod-pgsql.
    Unpacking proftpd-mod-pgsql (from .../proftpd-mod-pgsql_1.3.4a-1_i386.deb) ...
    Processing triggers for ureadahead ...
    Processing triggers for man-db ...
    Setting up libcap2 (1:2.22-1ubuntu3) ...
    Setting up proftpd-basic (1.3.4a-1) ...
    Warning: The home dir /var/run/proftpd you specified can't be accessed: No such file or directory
    Adding system user `proftpd' (UID 109) ...
    Adding new user `proftpd' (UID 109) with group `nogroup' ...
    Not creating home directory `/var/run/proftpd'.
    Adding system user `ftp' (UID 110) ...
    Adding new user `ftp' (UID 110) with group `nogroup' ...
    Creating home directory `/srv/ftp' ...
    `/usr/share/proftpd/templates/welcome.msg' -> `/srv/ftp/welcome.msg.proftpd-new'
     * Starting ftp server proftpd                                                  trainingday proftpd[7609]: mod_tls/2.4.3: compiled using OpenSSL version 'OpenSSL 1.0.0e 6 Sep 2011' headers, but linked to OpenSSL version 'OpenSSL 1.0.1 14 Mar 2012' library
    trainingday proftpd[7609]: mod_sftp/0.9.8: compiled using OpenSSL version 'OpenSSL 1.0.0e 6 Sep 2011' headers, but linked to OpenSSL version 'OpenSSL 1.0.1 14 Mar 2012' library
    trainingday proftpd[7609]: mod_tls_memcache/0.1: notice: unable to register 'memcache' SSL session cache: Memcache support not enabled
                                                                             [ OK ]
    Setting up proftpd-mod-pgsql (1.3.4a-1) ...
    Processing triggers for libc-bin ...
    ldconfig deferred processing now taking place
    root@trainingday:/etc/nginx#

When prompted by debconf with the following question, select standalone:

 ┌─────────────────────────┤ ProFTPD configuration ├─────────────────────────┐  
 │ ProFTPD can be run either as a service from inetd, or as a standalone     │  
 │ server. Each choice has its own benefits. With only a few FTP             │  
 │ connections per day, it is probably better to run ProFTPD from inetd in   │  
 │ order to save resources.                                                  │  
 │                                                                           │  
 │ On the other hand, with higher traffic, ProFTPD should run as a           │  
 │ standalone server to avoid spawning a new process for each incoming       │  
 │ connection.                                                               │  
 │                                                                           │  
 │ Run proftpd:                                                              │  
 │                                                                           │  
 │                                from inetd                                 │  
 │                                standalone                                 │  
 │                                                                           │  
 │                                                                           │  
 │                                  <Ok>                                     │  
 │                                                                           │  
 └───────────────────────────────────────────────────────────────────────────┘  

Configure ProFTPd

The configuration of ProFTPd is explained in the wiki at Admin/Config/Upload via FTP.

    root@trainingday:/etc/nginx# cd /etc/proftpd/
    root@trainingday:/etc/proftpd# vim modules.conf

In /etc/proftpd/modules.conf, uncomment the following 3 directives:

    LoadModule mod_sql.c
    LoadModule mod_sql_postgres.c
    LoadModule mod_sql_passwd.c
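
If you prefer to script this step, something like the following should uncomment just those three lines.  It is a sketch that assumes GNU sed and that each directive is commented out with a leading "#"; enable_sql_modules is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: uncomment the three SQL-related LoadModule lines.
enable_sql_modules() {
    # Matches mod_sql.c, mod_sql_postgres.c and mod_sql_passwd.c only;
    # other commented modules (e.g. mod_sql_mysql.c) are left untouched.
    sed -i -E 's/^#[[:space:]]*(LoadModule mod_sql(_postgres|_passwd)?\.c)/\1/' "$1"
}

# Example, run as root:
# enable_sql_modules /etc/proftpd/modules.conf
```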

Once saved, continue with:

    root@trainingday:/etc/proftpd# vim proftpd.conf

In /etc/proftpd/proftpd.conf, change:

    User                            proftpd
    Group                           nogroup

To:

    User                            galaxy
    Group                           galaxy

Once saved, continue with:

    root@trainingday:/etc/proftpd# cd conf.d
    root@trainingday:/etc/proftpd/conf.d# vim galaxy.conf

In /etc/proftpd/conf.d/galaxy.conf, paste the following:

    # Cause every FTP user to be "jailed" (chrooted) into their home directory
    DefaultRoot                     ~

    # Automatically create home directory if it doesn't exist
    CreateHome                      on dirmode 700

    # Allow users to overwrite their files
    AllowOverwrite                  on

    # Allow users to resume interrupted uploads
    AllowStoreRestart               on

    # Bar use of SITE CHMOD
    <Limit SITE_CHMOD>
      DenyAll
    </Limit>

    # Bar use of RETR (download) since this is not a public file drop
    <Limit RETR>
      DenyAll
    </Limit>

    # Do not authenticate against real (system) users
    AuthPAM                         off

    # Set up mod_sql_passwd - Galaxy passwords are stored as hex-encoded SHA1
    SQLPasswordEngine               on
    SQLPasswordEncoding             hex

    # Set up mod_sql to authenticate against the Galaxy database
    SQLEngine                       on
    SQLBackend                      postgres
    SQLConnectInfo                  galaxy@/var/run/postgresql galaxy
    SQLAuthTypes                    SHA1
    SQLAuthenticate                 users

    # An empty directory in case chroot fails
    SQLDefaultHomedir               /var/lib/proftpd/empty

    # Define a custom query for lookup that returns a passwd-like entry.  UID and GID should match your Galaxy user.
    SQLUserInfo                     custom:/LookupGalaxyUser
    SQLNamedQuery                   LookupGalaxyUser SELECT "email,password,'1001','1001','/home/galaxy/ftp/%U','/bin/bash' FROM galaxy_user WHERE email='%U'"
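
Two details of this configuration are easy to check from a shell: the '1001' values in the query should be the UID and GID of your galaxy account, and the password column is expected to hold a hex-encoded SHA1 digest.  The helpers below are a sketch of both checks.  The names account_ids and sha1_hex are invented for this example, and sha1sum is assumed to be installed (it is part of coreutils).

```shell
#!/bin/sh
# Print "UID GID" for a system account, for use in the SQLNamedQuery.
account_ids() {
    echo "$(id -u "$1") $(id -g "$1")"
}

# Print the hex-encoded SHA1 digest of a string, i.e. the format that
# SQLAuthTypes SHA1 with SQLPasswordEncoding hex compares against.
sha1_hex() {
    printf '%s' "$1" | sha1sum | cut -d ' ' -f 1
}

# Examples:
# account_ids galaxy
# sha1_hex 'some password'
```

If account_ids galaxy prints something other than 1001 1001 on your system, substitute those numbers in the SQLNamedQuery line.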

Once saved, continue with:

    root@trainingday:/etc/proftpd/conf.d# mkdir -p /var/lib/proftpd/empty
    root@trainingday:/etc/proftpd/conf.d# su - galaxy
    galaxy@trainingday:~$ cd galaxy-dist/
    galaxy@trainingday:~/galaxy-dist$ vim universe_wsgi.ini

In universe_wsgi.ini, set the following:

  • ftp_upload_dir = /home/galaxy/ftp - The directory where files uploaded via FTP will be placed.

  • ftp_upload_site = localhost - The FTP site hostname displayed on the upload form.

Once saved, continue with:

    galaxy@trainingday:~/galaxy-dist$ exit
    logout
    root@trainingday:/etc/proftpd/conf.d# /etc/init.d/proftpd restart
     * Stopping ftp server proftpd                                           [ OK ]
     * Starting ftp server proftpd                                                  trainingday proftpd[7908]: mod_tls/2.4.3: compiled using OpenSSL version 'OpenSSL 1.0.0e 6 Sep 2011' headers, but linked to OpenSSL version 'OpenSSL 1.0.1 14 Mar 2012' library
    trainingday proftpd[7908]: mod_sftp/0.9.8: compiled using OpenSSL version 'OpenSSL 1.0.0e 6 Sep 2011' headers, but linked to OpenSSL version 'OpenSSL 1.0.1 14 Mar 2012' library
    trainingday proftpd[7908]: mod_tls_memcache/0.1: notice: unable to register 'memcache' SSL session cache: Memcache support not enabled
                                                                             [ OK ]
    root@trainingday:/etc/proftpd/conf.d# /etc/init.d/galaxy restart
     * Stopping Galaxy main                                                  [ OK ]
     * Starting Galaxy main                                                  [ OK ]
    root@trainingday:/etc/proftpd/conf.d#

The OpenSSL version and memcache warnings printed when proftpd starts can safely be ignored.

Configure Galaxy to use Sun Grid Engine

The configuration of Galaxy's cluster interface is explained in the wiki at Admin/Config/Performance/Cluster.

A bit of work occurred behind the scenes for this step.  I preinstalled and preconfigured SGE in the VM, since setting up your DRM is outside of the scope of Galaxy configuration.

    root@trainingday:/etc/proftpd/conf.d# cd /etc/init.d
    root@trainingday:/etc/init.d# vim galaxy

In /etc/init.d/galaxy, add the following to the section at the top where other environment variables are set:

    DRMAA_LIBRARY_PATH="/usr/lib/libdrmaa.so.1.0"
    SGE_ROOT="/var/lib/gridengine"
    export DRMAA_LIBRARY_PATH SGE_ROOT

Once saved, continue with:

    root@trainingday:/etc/init.d# su - galaxy
    galaxy@trainingday:~$ cd galaxy-dist
    galaxy@trainingday:~/galaxy-dist$ vim universe_wsgi.ini

In universe_wsgi.ini, set the following:

  • start_job_runners = drmaa - Start the DRMAA job runner. 

  • default_cluster_job_runner = drmaa:/// - By default, run jobs on the cluster.

  • Comment out the local:/// tool overrides in the [galaxy:tool_runners] section.

Once saved, continue with:

    galaxy@trainingday:~/galaxy-dist$ exit
    logout
    root@trainingday:/etc/init.d# /etc/init.d/galaxy restart
     * Stopping Galaxy main                                                  [ OK ]
     * Starting Galaxy main                                                  [ OK ]
    root@trainingday:/etc/init.d#
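
If jobs do not reach the cluster after the restart, one thing worth ruling out is that the two variables added to the init script point at paths that do not exist on your system.  The check below is a sketch; check_drmaa_env is a name invented here, and the paths in the example are the ones used in this VM, which may differ on other installations.

```shell
#!/bin/sh
# Hypothetical helper: verify the DRMAA-related environment before starting Galaxy.
check_drmaa_env() {
    status=0
    # The DRMAA library must be an existing file.
    if [ ! -f "$DRMAA_LIBRARY_PATH" ]; then
        echo "DRMAA_LIBRARY_PATH is not a file: '$DRMAA_LIBRARY_PATH'" >&2
        status=1
    fi
    # SGE_ROOT must be an existing directory.
    if [ ! -d "$SGE_ROOT" ]; then
        echo "SGE_ROOT is not a directory: '$SGE_ROOT'" >&2
        status=1
    fi
    return $status
}

# Example:
# DRMAA_LIBRARY_PATH="/usr/lib/libdrmaa.so.1.0"
# SGE_ROOT="/var/lib/gridengine"
# check_drmaa_env && echo "DRMAA environment looks sane"
```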

Run multiple Galaxy processes

The configuration of scaling with multiple processes is explained in the wiki at Admin/Config/Performance/Scaling.

    root@trainingday:/etc/init.d# /etc/init.d/galaxy stop
     * Stopping Galaxy main                                                  [ OK ]
    root@trainingday:/etc/init.d# su - galaxy
    galaxy@trainingday:~$ cd galaxy-dist/
    galaxy@trainingday:~/galaxy-dist$ vim universe_wsgi.ini

In universe_wsgi.ini, comment out [server:main] and all of that section's contents.  Then add the following sections to the top of the file:

    [server:web0]
    use = egg:Paste#http
    port = 8080
    use_threadpool = True

    [server:web1]
    use = egg:Paste#http
    port = 8081
    use_threadpool = True

    [server:manager]
    use = egg:Paste#http
    port = 8085
    use_threadpool = True

    [server:handler0]
    use = egg:Paste#http
    port = 8090
    use_threadpool = True

    [server:handler1]
    use = egg:Paste#http
    port = 8091
    use_threadpool = True

Further down in the file, set the following:

  • job_manager = manager - Specifies that the server named 'manager' defined above should have the role of assigning jobs to handlers.

  • job_handlers = handler0,handler1 - Specifies that the servers named 'handler0' and 'handler1' should have the role of running, tracking, and finishing jobs.

Once saved, continue with:

    galaxy@trainingday:~/galaxy-dist$ exit
    logout
    root@trainingday:/etc/init.d# /etc/init.d/galaxy start
     * Starting Galaxy web0                                                  [ OK ]
     * Starting Galaxy web1                                                  [ OK ]
     * Starting Galaxy manager                                               [ OK ]
     * Starting Galaxy handler0                                              [ OK ]
     * Starting Galaxy handler1                                              [ OK ]
    root@trainingday:/etc/init.d#
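
The init script starts five processes here because it scans universe_wsgi.ini for [server:NAME] section headers.  The snippet below isolates that scan so you can preview which servers would be started.  It uses the same sed expression as the init script; the wrapper name list_servers is invented for this example.

```shell
#!/bin/sh
# Print the NAME of every [server:NAME] section on one line.
# A commented-out header such as "#[server:main]" does not begin with "["
# and is therefore skipped.
list_servers() {
    sed -n 's/^\[server:\(.*\)\]/\1/p' "$1" | xargs echo
}

# Example:
# list_servers /home/galaxy/galaxy-dist/universe_wsgi.ini
```

This is also why commenting out [server:main] matters: a header left uncommented would be started as one more process.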

Configure the Distributed Object Store

The distributed object store allows you to balance datasets across multiple filesystems and multiple file servers.

    root@trainingday:/etc/init.d# su - galaxy
    galaxy@trainingday:~$ cd galaxy-dist/
    galaxy@trainingday:~/galaxy-dist$ cp distributed_object_store_conf.xml.sample distributed_object_store_conf.xml
    galaxy@trainingday:~/galaxy-dist$ vim universe_wsgi.ini

In universe_wsgi.ini, set the following:

  • object_store = distributed

  • distributed_object_store_config_file = distributed_object_store_conf.xml

Once saved, continue with:

    galaxy@trainingday:~/galaxy-dist$ exit
    logout
    root@trainingday:/etc/init.d# /etc/init.d/galaxy restart
     * Stopping Galaxy web0                                                  [ OK ]
     * Stopping Galaxy web1                                                  [ OK ]
     * Stopping Galaxy manager                                               [ OK ]
     * Stopping Galaxy handler0                                              [ OK ]
     * Stopping Galaxy handler1                                              [ OK ]
     * Starting Galaxy web0                                                  [ OK ]
     * Starting Galaxy web1                                                  [ OK ]
     * Starting Galaxy manager                                               [ OK ]
     * Starting Galaxy handler0                                              [ OK ]
     * Starting Galaxy handler1                                              [ OK ]
    root@trainingday:/etc/init.d#