Using the CCL Storage Pool

To begin, add the cctools to your path:
% setenv PATH /afs/crc.nd.edu/group/ccl/software/cctools/bin:$PATH

Or, download a recent version of the software.

Next, view the currently available storage devices by using the storage catalog.

The simplest way to access a Chirp server is via the Chirp command line tool. This allows you to connect to a server, copy files, and manage directories, much like an FTP client:

% chirp
 chirp::> open wombat00.cselab.nd.edu
connected to wombat00.cselab.nd.edu as unix:dthain
 chirp:wombat00.cselab.nd.edu:/> mkdir mydata
 chirp:wombat00.cselab.nd.edu:/> cd mydata
 chirp:wombat00.cselab.nd.edu:/mydata> put /tmp/bigfile
file /tmp/bigfile -> /mydata/bigfile (11.01 MB/s)
 chirp:wombat00.cselab.nd.edu:/mydata> ls -la
dir      4096 .                                        Fri Sep 10 12:40:27 2004
dir      4096 ..                                       Fri Sep 10 12:40:27 2004
file 104857600 bigfile                                 Fri Sep 10 12:53:21 2004
 chirp:wombat00.cselab.nd.edu:/mydata>

In scripts, you may find it easier to use the standalone commands chirp_get and chirp_put, which move single files to and from a Chirp server. These commands also allow for streaming data, which can be helpful in a shell pipeline. Also, the -f option to both commands allows you to follow a file, much like the Unix tail command:

 % tar cvzf archive.tar.gz ~/mydata
 % chirp_put archive.tar.gz wombat00.cselab.nd.edu /mydata/archive.tar.gz
 % ...
 % chirp_get wombat00.cselab.nd.edu /mydata/archive.tar.gz - | tar xvzf -
 % ...
 % chirp_get -f wombat00.cselab.nd.edu /mydata/logfile -
 %

An easier way to access Chirp servers is by using a tool called Parrot. Parrot is a personal virtual filesystem: it "speaks" remote I/O operations on behalf of ordinary programs. For example, you can use Parrot with your regular shell to access Chirp servers like so:

 % parrot_run tcsh
 ...
 % cd /chirp/wombat00.cselab.nd.edu
 % mkdir mydir
 % cd mydir 
 % cp /tmp/bigfile .
 % ls -la
total 804
drwx------    2 condor   dip          4096 Sep 10 12:40 .
drwx------    2 condor   dip          4096 Sep 10 12:40 ..
-rw-r--r--    1 condor   dip      104857600 Sep 10 12:57 bigfile
 % cp /http/www.cse.nd.edu temp.html
 % vi temp.html
 %

Parrot is the most convenient way to access storage, but it has two limitations: it works only on Linux, and it imposes a performance penalty, because Parrot makes an extra data copy while handling a program's system calls.

The fourth way to access the storage pool is to write your own programs that use the Chirp C interface. You must compile and link against the following files in the ordinary way:

..../ccl/software/devel/include/chirp_client.h
..../ccl/software/devel/include/chirp_reli.h
..../ccl/software/devel/lib/libchirp.a

The chirp_client.h interface allows you to explicitly connect to a server and open, close, read, and write files, much as in a traditional Unix interface. This interface is unreliable in the sense that a broken connection will cause all further operations to fail. To recover, you must explicitly reconnect to the server.

The chirp_reli.h interface is a reliable version of the chirp_client interface. The programmer need not explicitly connect to or disconnect from servers, but simply names the host and file to access. The library transparently handles connection as well as recovery from temporary failures.
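
The difference between the two interfaces can be illustrated with a small, self-contained C sketch. It does not use the Chirp library: mock_conn, mock_connect, mock_read, and reliable_read are hypothetical stand-ins invented for this illustration, and the fixed retry count is an assumption made for simplicity. The sketch only models the behavior described above: once the raw connection breaks, every operation fails until the caller reconnects, while a reliable wrapper reconnects and retries on the caller's behalf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a connection-oriented client.
   This is NOT the Chirp API; it only simulates a link that can break. */
struct mock_conn {
    bool connected;       /* false once the simulated link has broken */
    int  ops_until_break; /* operations the link survives before breaking */
};

/* Simulated connect: always succeeds; the new link survives `lifetime` operations. */
void mock_connect(struct mock_conn *c, int lifetime)
{
    c->connected = true;
    c->ops_until_break = lifetime;
}

/* Simulated read on the raw connection. Returns the number of bytes "read",
   or -1 on failure. Once the link breaks, every later call fails until the
   caller reconnects (the "unreliable" behavior). */
long mock_read(struct mock_conn *c, char *buf, size_t len)
{
    if (!c->connected)
        return -1;
    if (c->ops_until_break == 0) {
        c->connected = false; /* the link breaks on this operation */
        return -1;
    }
    c->ops_until_break--;
    memset(buf, 'x', len);    /* pretend the server sent len bytes */
    return (long)len;
}

/* Reliable wrapper: on failure, reconnect and retry, up to max_attempts tries.
   *reconnects counts how many times the wrapper had to reconnect. */
long reliable_read(struct mock_conn *c, char *buf, size_t len,
                   int max_attempts, int *reconnects)
{
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        long n = mock_read(c, buf, len);
        if (n >= 0)
            return n;
        mock_connect(c, 2);   /* recover transparently, then try again */
        (*reconnects)++;
    }
    return -1;                /* still failing after all attempts */
}
```

With the raw functions, a caller has to notice the -1 return and call mock_connect itself before any further read can succeed. With reliable_read, the same caller simply issues the read and sees a failure only if every attempt fails.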
