Enabling upload to Galaxy via FTP

To allow users to upload files to Galaxy via FTP, you'll need to configure Galaxy and install an FTP server. Once everything is configured, users will be able to upload their files through the FTP server and then select them for import in Galaxy's upload dialog.

For help with uploading data via FTP on Galaxy Main, please see this tutorial.

Install an FTP server

Although no specific server is required, we use ProFTPD for our public site since it supports everything we need, such as authenticating against the Galaxy database. We recommend using the same FTP server, since the configurations we provide target it. You can also browse the list of alternative FTP servers at http://en.wikipedia.org/wiki/List_of_FTP_server_software

Configure Galaxy

The first step is to choose a directory into which your users will upload files. Preferably this will be on the same filesystem as Galaxy's datasets (by default, galaxy_dist/database/files/). The FTP server will create subdirectories inside this directory, each named after a user's email address. Likewise, Galaxy will expect to find email-named subdirectories at that path. This directory should be set in the config file (galaxy.ini) as ftp_upload_dir.

In the config file, you'll also want to set ftp_upload_site to the hostname your users should connect to via FTP. This will be provided in the help text on the Upload File form.
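
As a rough illustration of how these two settings fit together, the sketch below parses a galaxy.ini-style fragment with Python's configparser and derives the directory in which Galaxy would look for one user's uploads. The [app:main] section name, the ftp.example.org hostname, and the helper name user_upload_dir are assumptions made for the example, not values from a real deployment.

```python
import configparser
import posixpath

# Hypothetical galaxy.ini fragment; the section name and values are assumptions.
EXAMPLE_INI = """
[app:main]
ftp_upload_dir = /home/nate/galaxy_dist/database/ftp
ftp_upload_site = ftp.example.org
"""


def user_upload_dir(ini_text, email):
    """Return the email-named subdirectory of ftp_upload_dir for one user."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    base = parser.get("app:main", "ftp_upload_dir")
    # The FTP server and Galaxy must agree on this path.
    return posixpath.join(base, email)


if __name__ == "__main__":
    print(user_upload_dir(EXAMPLE_INI, "user@example.org"))
```

Whatever path this yields for a given user is the same path the FTP server must hand out as that user's home directory.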

Allow your FTP server to read Galaxy's database

You'll need to grant a database user permission to read emails and passwords from the Galaxy database. Although the user Galaxy itself connects as could be used, we prefer a least-privilege setup in which a separate user is created for the FTP server, with permission to SELECT from the galaxy_user table and nothing else. In PostgreSQL this is accomplished with:

postgres@dbserver% createuser -SDR galaxyftp
postgres@dbserver% psql galaxydb
Welcome to psql 8.X.Y, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

galaxydb=# ALTER ROLE galaxyftp PASSWORD 'dbpassword';
ALTER ROLE
galaxydb=# GRANT SELECT ON galaxy_user TO galaxyftp;
GRANT

Configuring ProFTPD

By default, Galaxy stores passwords using PBKDF2. It's possible to disable this using the use_pbkdf2 = False setting in galaxy.ini. Once disabled, any new passwords created will be stored in an older hex-encoded SHA1 format. Because of this, it's possible to have both PBKDF2 and SHA1 passwords in your database (especially if your server has been around since before PBKDF2 support was added). Although this is fine (Galaxy can read passwords in either format), ProFTPD will expect them in one format or the other (although with some amount of hackery it could probably be made to read both).

Because of this, you'll need to choose one or the other in your Galaxy config (PBKDF2 is more secure and therefore preferred) and configure ProFTPD accordingly. If users cannot log in because their password is stored in the wrong format, they can simply use Galaxy's password change form to set their password, which will rewrite their password using the currently configured algorithm.
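
To find out which users would be affected before you commit to one format, the stored values can be classified by shape. The sketch below is a minimal example under two assumptions: PBKDF2 entries start with the literal prefix PBKDF2 (the same prefix the ProFTPD queries later in this document test for), and SHA1 entries are 40 hexadecimal characters. The function name password_format is hypothetical.

```python
import re

# Assumption: a hex-encoded SHA1 digest is exactly 40 hexadecimal characters.
_SHA1_HEX = re.compile(r"^[0-9a-fA-F]{40}$")


def password_format(stored):
    """Classify a stored password string as 'PBKDF2', 'SHA1', or 'unknown'."""
    if stored.startswith("PBKDF2"):
        return "PBKDF2"
    if _SHA1_HEX.match(stored):
        return "SHA1"
    return "unknown"
```

Running this over the password column would show how many accounts need to reset their password after you settle on a format.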

For more hints on the PBKDF2 configuration, see the fantastic blog post FTP upload to Galaxy using ProFTPd and PBKDF2 by Peter Briggs (which was used to create the documentation below).

As noted above, any FTP server should work, but our public site uses ProFTPD, which needs the following extra modules:

  • mod_sql
  • mod_sql_postgres or mod_sql_mysql
  • mod_sql_passwd

You should read the INSTALL file that comes with the ProFTPD source distribution and, at a minimum, consider whether you need options such as "install_user=<user> install_group=<group> ./configure --sysconfdir=/etc --localstatedir=/var". We compile by hand using the following configure arguments (OpenSSL is prebuilt and statically linked):

./configure --prefix=/foo --disable-auth-file --disable-ncurses --disable-ident --disable-shadow --enable-openssl --with-modules=mod_sql:mod_sql_postgres:mod_sql_passwd --with-includes=/usr/postgres/9.1-pgdg/include:`pwd`/../openssl/.openssl/include --with-libraries=/usr/postgres/9.1-pgdg/lib/64:`pwd`/../openssl/.openssl/lib

An example configuration follows, assuming ftp_upload_dir = /home/nate/galaxy_dist/database/ftp in the Galaxy config file:

# Basics, some site-specific
ServerName                      "Public Galaxy FTP"
ServerType                      standalone
DefaultServer                   on
Port                            21
Umask                           077
SyslogFacility                  DAEMON
SyslogLevel                     debug
MaxInstances                    30

# This User & Group should be set to the actual user and group names which match the UID & GID you will specify later in the SQLNamedQuery.
User                            nobody
Group                           nogroup
DisplayConnect                  /etc/opt/local/proftpd_welcome.txt

# Passive port range for the firewall
PassivePorts                    30000 40000

# Cause every FTP user to be "jailed" (chrooted) into their home directory
DefaultRoot                     ~

# Automatically create home directory if it doesn't exist
CreateHome                      on dirmode 700

# Allow users to overwrite their files
AllowOverwrite                  on

# Allow users to resume interrupted uploads
AllowStoreRestart               on

# Bar use of SITE CHMOD
<Limit SITE_CHMOD>
    DenyAll
</Limit>

# Bar use of RETR (download) since this is not a public file drop
<Limit RETR>
    DenyAll
</Limit>

# Do not authenticate against real (system) users
<IfModule mod_auth_pam.c>
AuthPAM                         off
</IfModule>

# Common SQL authentication options
SQLEngine                       on
SQLPasswordEngine               on
SQLBackend                      postgres
SQLConnectInfo                  galaxydb@dbserver.example.org[:port] <dbuser> <dbpassword>
SQLAuthenticate                 users

For PBKDF2 passwords, the following additions to proftpd.conf should work:

# Configuration that handles PBKDF2 encryption
# Set up mod_sql to authenticate against the Galaxy database
SQLAuthTypes                    PBKDF2
SQLPasswordPBKDF2               SHA256 10000 24
SQLPasswordEncoding             base64

# For PBKDF2 authentication
# See http://dev.list.galaxyproject.org/ProFTPD-integration-with-Galaxy-td4660295.html
SQLPasswordUserSalt             sql:/GetUserSalt

# Define a custom query for lookup that returns a passwd-like entry. Replace 512s with the UID and GID of the user running the Galaxy server
SQLUserInfo                     custom:/LookupGalaxyUser
SQLNamedQuery                   LookupGalaxyUser SELECT "email, (CASE WHEN substring(password from 1 for 6) = 'PBKDF2' THEN substring(password from 38 for 69) ELSE password END) AS password2,512,512,'/home/nate/galaxy_dist/database/ftp/%U','/bin/bash' FROM galaxy_user WHERE email='%U'"

# Define custom query to fetch the password salt
SQLNamedQuery                   GetUserSalt SELECT "(CASE WHEN SUBSTRING (password from 1 for 6) = 'PBKDF2' THEN SUBSTRING (password from 21 for 16) END) AS salt FROM galaxy_user WHERE email='%U'"
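
The substring offsets in the two named queries above can be hard to check by eye. The following sketch mirrors them in Python and verifies a password the way the configuration above describes (SHA256, 10000 iterations, 24-byte key, base64 output). It assumes the stored string has the layout PBKDF2$sha256$10000$<16-character salt>$<base64 hash>, which is inferred from the offsets rather than taken from Galaxy's source, and that the salt is used as its literal text. The helper names split_pbkdf2, check_password, and make_stored are hypothetical.

```python
import base64
import hashlib
import hmac


def split_pbkdf2(stored):
    """Slice salt and hash out of a stored string using the query offsets."""
    # SQL substring() is 1-indexed: "from 21 for 16" is Python [20:36],
    # and "from 38 for 69" is Python [37:106].
    salt = stored[20:36]
    hashed = stored[37:106]
    return salt, hashed


def check_password(stored, password):
    """Recompute the PBKDF2 hash and compare it with the stored one."""
    salt, hashed = split_pbkdf2(stored)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("ascii"), 10000, dklen=24
    )
    candidate = base64.b64encode(derived).decode("ascii")
    return hmac.compare_digest(candidate, hashed)


def make_stored(password, salt):
    """Build a string in the assumed layout (for demonstration only)."""
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("ascii"), 10000, dklen=24
    )
    return "PBKDF2$sha256$10000$" + salt + "$" + base64.b64encode(derived).decode("ascii")
```

If a real row from galaxy_user does not verify with a sketch like this, the layout assumption is the first thing to re-examine before debugging ProFTPD itself.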

For SHA1 passwords, the following additions to proftpd.conf should work:

# Set up mod_sql/mod_sql_passwd - Galaxy passwords are stored as hex-encoded SHA1
SQLAuthTypes                    SHA1
SQLPasswordEncoding             hex

# An empty directory in case chroot fails
SQLDefaultHomedir               /var/opt/local/proftpd

# Define a custom query for lookup that returns a passwd-like entry. Replace 512s with the UID and GID of the user running the Galaxy server
SQLUserInfo                     custom:/LookupGalaxyUser
SQLNamedQuery                   LookupGalaxyUser SELECT "email,password,512,512,'/home/nate/galaxy_dist/database/ftp/%U','/bin/bash' FROM galaxy_user WHERE email='%U'"
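
For comparison, the SHA1 case needs no salt handling. The sketch below shows the check implied by the settings above, assuming the stored value is the unsalted, hex-encoded SHA1 digest of the password (no salt directive appears in this configuration). The function name check_sha1_password is hypothetical.

```python
import hashlib
import hmac


def check_sha1_password(stored_hex, password):
    """Compare a password against a hex-encoded, unsalted SHA1 digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest()
    # Normalise case so upper-case hex in the database still matches.
    return hmac.compare_digest(digest, stored_hex.lower())
```

The absence of a salt and of any iteration count is the reason the PBKDF2 format is preferred.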

Further security measures

The FTP protocol is not encrypted by default, so usernames and passwords are sent in clear text. You may wish to implement further security measures by forcing FTP connections to use SSL/TLS, or by allowing users to send their files using SFTP (a completely different protocol from FTP; see http://www.proftpd.org/docs/contrib/mod_sftp.html). Here are some extra steps that you can use with ProFTPD.

<IfModule mod_sftp.c>
  # You must put this in a virtual host if you want it to listen on its own port. VHost != Apache Vhost.
  <VirtualHost IP_of_Galaxy>
    # You may wish to open a new port for this so that you don't lock yourself out of SSH. I chose 2222
    Port 2222
    SFTPEngine on
    # If you don't set AuthOrder you will get weird disconnects
    AuthOrder mod_auth_unix.c mod_sql.c
    SFTPHostKey /etc/ssh/ssh_host_rsa_key
    # You may wish to restrict SFTP to certain ciphers, but this is likely to lock out some users
    # SFTPCiphers aes256-ctr aes192-ctr aes128-ctr
    RequireValidShell no
    MaxLoginAttempts 6
    ServerName                      "Galaxy SFTP"
    Umask                           077
    User                            galaxyftp
    Group                           galaxyftp
    UseFtpUsers off
    DefaultRoot                     ~
    AllowOverwrite                  on
    AllowStoreRestart               on
    # .. Other rules for directories, etc
    SQLEngine                       on
    SQLGroupInfo                    sftp_groups name id members
    # .. See above, the same SQL rules apply
  </VirtualHost>
</IfModule>
<IfModule mod_tls.c>
    TLSEngine on
    TLSLog /var/log/proftpd/tls.log
    # Make sure users know that their clients must support TLS 1.2. This is very restrictive, but likely the best choice
    TLSProtocol TLSv1.2
    TLSRSACertificateFile /etc/pki/tls/certs/your_cert.cer
    TLSRSACertificateKeyFile /etc/pki/tls/private/your.key
    TLSCertificateChainFile /etc/pki/tls/certs/your_intermediate.cer
    TLSRenegotiate none
    TLSCipherSuite ALL:!SSLv2:!SSLv3
    TLSVerifyClient off
    TLSRequired auth+data
    TLSOptions NoSessionReuseRequired
    ServerName                      "Galaxy FTP"
    ServerType                      standalone
    DefaultServer                   on
    Port                            21
    Umask                           077
    SyslogFacility                  DAEMON
    SyslogLevel                     debug
    MaxInstances                    30
    User                            galaxyftp
    Group                           galaxyftp
    UseFtpUsers off
    # Passive port range for the firewall - note that you must open these ports for this to work!
    # Since the FTP traffic is now encrypted, your firewall can't peek to see that it is passive FTP
    # and it will block it if you don't allow new connections on these ports.
    PassivePorts                    30000 30100
    # Cause every FTP user to be "jailed" (chrooted) into their home directory
    DefaultRoot                     ~
    AllowOverwrite                  on
    # Allow users to resume interrupted uploads
    AllowStoreRestart               on
    # .. Other rules for directories, etc
    SQLEngine                       on
    # .. See above, the same SQL rules apply
</IfModule>

You may need to take some additional steps to get this working. Compile ProFTPD with --with-modules=mod_sftp:mod_tls in addition to the SQL modules. You may also need to alter your PostgreSQL configuration (typically pg_hba.conf) to allow local IPv6 connections:

# IPv6 local connections:
host    all         all         ::1/128               trust

You may also need to add a table called sftp_groups (the table named in the SQLGroupInfo directive above) to your Galaxy database to allow SFTP connections.

psql yourDB
yourDB=# CREATE TABLE sftp_groups (
    id      char(5) CONSTRAINT firstkey PRIMARY KEY,
    name    varchar(40) NOT NULL,
    members varchar(100));

With these steps, you should be able to allow users to connect to your server using secure protocols.
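
One way to confirm the TLS setup from the client side is a short script using Python's standard ftplib. This is only a sketch: the helper names make_tls_context and upload_file are hypothetical, the host, credentials, and file names are whatever applies to your site, and the upload function needs a live server to do anything. It mirrors the TLSProtocol TLSv1.2 and TLSRequired auth+data settings above by refusing older protocol versions and protecting the data channel.

```python
import ssl
from ftplib import FTP_TLS


def make_tls_context():
    """Client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def upload_file(host, email, password, local_path, remote_name):
    """Upload one file over explicit FTPS (hypothetical helper)."""
    ftps = FTP_TLS(context=make_tls_context())
    ftps.connect(host, 21)
    # login() secures the control channel before sending credentials.
    ftps.login(email, password)
    # Switch the data channel to encrypted mode as well.
    ftps.prot_p()
    with open(local_path, "rb") as handle:
        ftps.storbinary("STOR " + remote_name, handle)
    ftps.quit()
```

If the login step fails while plain FTP works, check the certificate files and the passive port range in the firewall first.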