BXGrid User's Manual

September 2009

Overview

BXGrid is a scientific repository for biometrics research at Notre Dame. Please visit BXGrid for more information.

Besides an easy-to-use web portal, which lets users browse, validate, share, and download biometrics datasets, we also provide a more powerful command line tool for expert users. This manual shows you how to use the BXGrid command line tool to perform basic and advanced operations.

You need a BXGrid account to use the tool. Please contact Dr. Douglas Thain: dthain at nd dot edu or Dr. Patrick Flynn: flynn at nd dot edu.

To use the command line tool, add the bxgrid directory to your path. If you use bash:

export PATH=/afs/nd.edu/user37/ccl/software/bxgrid/bin:${PATH}

Or if you use tcsh:

setenv PATH /afs/nd.edu/user37/ccl/software/bxgrid/bin:${PATH}

Basic Operations

Running bxgrid without any options shows a quick help page:

[hbui@cclws01]$ bxgrid
Use:
 bxgrid [options] <database> <function> <argument(s)> ...
Options are:
 -G Enable Chirp  debug mode
 -D Enable detail mode, show import/export information
 -M Enable Mysql  debug mode
 -u specify username
 -p specify password
 -s specify database hostname
 -o  output format for metadata. Options are -o line or -o csv
 -m export symlink instead of actual data file
 -n number of files to audit or repair. (default 1000)
Avaliable functions:
 login    Stores username and password on local computer.
          e.g. bxgrid biometrics login
 logout   Deletes username and password on local computer.
          e.g. bxgrid biometrics logout
 import   <table>  <metadata file>
          Import recordings from metadata file to <table>.
          e.g. bxgrid biometrics import faces_still 200801.datafile data
 export   <table> to <metadata file> <output dir> as <dir/name schema> where <conditions>
          Export metadata + data files from <table> with conditions.
          e.g. bxgrid biometrics export irises_still to test.xml /tmp as /race/sequenceid metadata subjectid,sequenceid where TRUE limit 100
 query   <table> to <metadata file> where <condition>
          Query metadata from <table> with conditions.
          e.g. bxgrid biometrics query irises_still to test.xml where TRUE limit 100
 delete   <table>  where <conditions>
          Delete recordings from <table> with conditions.
          e.g. bxgrid biometrics delete faces_still where subjectid = 'nd1S04473'
 benchmark   <table>  <metadata file>  <data path>
          Benchmark bxgrid performance.
          e.g. bxgrid biometrics benchmark faces_still 200801-facestills.datafile~ccl/software/bxgrid/testdata/dat
 transform_set   <setname>  <function> <number of file per condor job>
          Transform a set using Condor.
          e.g. bxgrid biometrics transform_set AP100 20
 audit    Evaluate database for consistency and redundancy.
          e.g. bxgrid biometrics audit -n 1000
 repair   Like audit, but also repair damage as needed.
          e.g. bxgrid biometrics repair -n 1000

Because we host multiple databases, you must specify the database for every operation, followed by the operation itself (import, export, query, and so on). By default, bxgrid prompts for your username and password every time. Alternatively, you can pass them on the command line with the -u and -p options:

[hbui@cclws01]$ bxgrid -u hbui -p mypassword biometrics export irises_still to test.xml /tmp as /race/sequenceid metadata subjectid,sequenceid where TRUE limit 100
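
When export commands are generated from a script, it can help to assemble the argument list programmatically. The following is a minimal Python sketch, not part of bxgrid itself: the function name build_export_command and its parameters are hypothetical, and it assumes that bxgrid accepts the where condition as separate whitespace-delimited arguments, as in the shell example above.

```python
def build_export_command(database, table, metadata_file, output_dir,
                         schema=None, fields=None, where="TRUE", limit=None):
    """Build the argument list for `bxgrid <database> export ...`.

    Hypothetical helper: it only mirrors the keyword order shown in the
    bxgrid help page (to, as, metadata, where, limit).
    """
    argv = ["bxgrid", database, "export", table, "to", metadata_file, output_dir]
    if schema:
        # Directory/name schema, e.g. /race/sequenceid
        argv += ["as", schema]
    if fields:
        # Metadata fields are passed as one comma-separated argument
        argv += ["metadata", ",".join(fields)]
    # Assumption: the condition is split on whitespace, like the shell does
    argv += ["where"] + where.split()
    if limit is not None:
        argv += ["limit", str(limit)]
    return argv


argv = build_export_command(
    "biometrics", "irises_still", "test.xml", "/tmp",
    schema="/race/sequenceid",
    fields=["subjectid", "sequenceid"],
    limit=100,
)
print(" ".join(argv))
```

The resulting list can be handed to subprocess.run(argv) on a machine where bxgrid is on the PATH. Credentials are deliberately left out here; the login operation described below avoids putting a password on the command line.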

Here are some bxgrid basic operations:

  1. login/logout: To avoid typing your username and password every time, use the bxgrid login operation to save them on the local computer:

    [hbui@cclws01]$ bxgrid biometrics login
    mysql login: hbui
    password:
    

    To delete the saved username and password, use bxgrid logout:

    [hbui@cclws01]$ bxgrid biometrics logout
    

  2. export: Run bxgrid <database> export <datatype> to <metadata file> <output dir> where <conditions>:

    [hbui@cclws01]$ bxgrid biometrics export irises_still to test.data /tmp where subjectid = \'nd1S04473\'
    

    There are five main data types you can export: irises_still, faces_still, irises_mov, faces_mov, and faces_3d. By default, export writes all metadata and uses sequenceid (if available) for the exported file names. To select specific metadata fields, use the keyword metadata; to choose the output directory layout and file naming schema, use the keyword as:

    [hbui@cclws01]$ bxgrid biometrics export irises_still to test.data /tmp as /race/fileid metadata subjectid,sequenceid where TRUE limit 100
    

    Output data is separated by race, and bxgrid uses fileid as the file name. The metadata file will look like this:

    subjectid       string  nd1S04473
    sequenceid      string  04473d468
    file    file    /tmp/White/160461.tiff
    
    subjectid       string  nd1S04853
    sequenceid      string  04853d578
    file    file    /tmp/Asian/160456.tiff
    

  3. query: query usage is very similar to export usage. Use query when you only want metadata:

    [hbui@cclws01]$ bxgrid biometrics query irises_still to test.data metadata subjectid,sequenceid where TRUE limit 100
    

    The metadata file will look like this:

    subjectid       string  nd1S04473
    sequenceid      string  04473d468
    file    file    xmlonly/04473d466.tiff
    
    subjectid       string  nd1S04853
    sequenceid      string  04853d578
    file    file    xmlonly/04473d466.tiff
    
    

    Again, without the keyword metadata, query returns all metadata by default.
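
The metadata files produced by export and query can be post-processed with a short script. The sketch below is one possible Python reader, not a tool shipped with bxgrid. It assumes what the samples above suggest: each line holds a name, a type, and a value separated by whitespace, and records are separated by blank lines. The function name parse_metadata is hypothetical.

```python
def parse_metadata(text):
    """Parse name/type/value metadata text into a list of dicts.

    Assumption: one 'name type value' triple per line and a blank line
    between recordings, as in the sample metadata files.
    """
    records = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current recording, if any
            if current:
                records.append(current)
                current = {}
            continue
        # Split into at most three parts so values may contain spaces
        parts = line.split(None, 2)
        if len(parts) < 2:
            raise ValueError("expected 'name type value', got: %r" % line)
        name = parts[0]
        value = parts[2] if len(parts) == 3 else ""
        current[name] = value
    if current:
        records.append(current)
    return records


sample = """\
subjectid       string  nd1S04473
sequenceid      string  04473d468
file    file    /tmp/White/160461.tiff

subjectid       string  nd1S04853
sequenceid      string  04853d578
file    file    /tmp/Asian/160456.tiff
"""

for record in parse_metadata(sample):
    print(record["subjectid"], record["file"])
```

The type column is ignored in this sketch; a stricter reader could use it to tell file entries apart from plain string fields.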

Advanced Operations

While basic operations do not change the state of the BXGrid repository, advanced operations add, remove, or modify its contents. The changes can affect both data and metadata. Please use bxgrid advanced operations with caution.

  1. import: Use import to ingest new data into BXGrid.
    Previously, import <table> <metadata file> <data path> required both a metadata file and a directory containing the actual data files. The metadata file was in line format (one line per recording). An example of a line-format metadata file:

    2008-253-042-1_L-lg4000.tiff	2463	Left	tiff	Brown	09/09/2008	No	nd4T00015	Inside	nd4E00054	nd4N00020	nd4I00013	1
    2008-253-042-1_R-lg4000.tiff	2463	Right	tiff	Brown	09/09/2008	No	nd4T00015	Inside	nd4E00054	nd4N00020	nd4I00013	1
    2008-253-042-2_L-lg4000.tiff	2463	Left	tiff	Brown	09/09/2008	No	nd4T00015	Inside	nd4E00054	nd4N00020	nd4I00013	2
    

    We are now moving to a new metadata file format based on name/value pairs, with one name, type, and value per line and a blank line between recordings:

    shotid	string	2008-253-042-1_L-lg4000.tiff
    subjectid	string	nd1S02463
    eye	string	Left
    format	string	tiff
    color	string	Brown
    date	string	2008-09-09 00:00:00
    glasses	string	No
    stageid	string	nd4T00015
    weather	string	Inside
    environmentid	string	nd4E00054
    sensorid	string	nd4N00020
    illuminantid1	string	nd4I00013
    shot	string	1
    state	string	unvalidated
    file	file	2008-253-042-1_L-lg4000.tiff
    
    shotid	string	2008-253-042-1_R-lg4000.tiff
    subjectid	string	nd1S02463
    eye	string	Right
    format	string	tiff
    color	string	Brown
    date	string	2008-09-09 00:00:00
    glasses	string	No
    stageid	string	nd4T00015
    weather	string	Inside
    environmentid	string	nd4E00054
    sensorid	string	nd4N00020
    illuminantid1	string	nd4I00013
    shot	string	1
    state	string	unvalidated
    file	file	2008-253-042-1_R-lg4000.tiff
    

    You now only need to specify a metadata file: import <table> <metadata file>

    bxgrid -D biometrics import faces_still test.dat
    /tmp/hbui/05213d305.NEF checksum: e4a46c6c6c93347b5d677db174b28382
    file id 1080247
    replica 2347150 sc0-14.cse.nd.edu bxgridtest/2/8/1080247_2347150.NEF
    replica 2347151 sc0-11.cse.nd.edu bxgridtest/8/1/1080247_2347151.NEF
    replica 2347152 sc0-04.cse.nd.edu bxgridtest/6/7/1080247_2347152.NEF
    /tmp/hbui/05213d306.JPG checksum: 2d402ebebdf219b59c48e5a3c3c93b32
    file id 1080248
    replica 2347153 sc0-15.cse.nd.edu bxgridtest/0/1/1080248_2347153.JPG
    replica 2347154 sc0-24.cse.nd.edu bxgridtest/8/6/1080248_2347154.JPG
    replica 2347155 sc0-08.cse.nd.edu bxgridtest/1/1/1080248_2347155.JPG
    /tmp/hbui/05213d307.NEF checksum: 10ba05351013682f74ae74c16f40866e
    file id 1080249
    replica 2347156 sc0-21.cse.nd.edu bxgridtest/5/7/1080249_2347156.NEF
    replica 2347157 sc0-28.cse.nd.edu bxgridtest/7/2/1080249_2347157.NEF
    replica 2347158 sc0-01.cse.nd.edu bxgridtest/7/5/1080249_2347158.NEF
    imported 3 recording into temp_collectionid 1253734194 faces_still table
    

    The -D option gives more detail about the ingestion process, such as the checksum of each data file and the location of its replicas. More importantly, it reports the unique temporary collectionid (1253734194 in this case). The temporary collectionid is a unique number for each ingestion batch. You can use it to assign validation tasks or to remove a bad batch with the delete operation.

    To convert the old one-line-per-recording format to the new name/value pair format, use this Perl script:

    /afs/nd.edu/user37/software/bxgrid/script/l2nvp.pl
    Usage:
    l2nvp.pl <datatype> <metadata file> <new metadata file>
    For example: [hbui@cclws01]$ l2nvp.pl irises_still 200807-irisstills.datafile test.dat

  2. delete: Use delete to remove data from BXGrid.

  3. audit: Use audit to check data integrity.

  4. repair: Use repair after audit to correct inconsistent data.
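
To make the relationship between the two import metadata formats concrete, the following Python sketch converts one line-format record into name/value pairs. It is an illustration only and not a replacement for l2nvp.pl. The column order, the nd1S prefix with five-digit zero padding for subjectid, the month/day/year reading of the date, and the default state of unvalidated are all assumptions inferred from the irises_still examples under import; other tables may use different columns.

```python
from datetime import datetime

# Assumed column order for irises_still, inferred from the example records
COLUMNS = ["shotid", "subjectid", "eye", "format", "color", "date", "glasses",
           "stageid", "weather", "environmentid", "sensorid", "illuminantid1",
           "shot"]


def line_to_record(line):
    """Convert one tab-separated line-format record to name/value text."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError("expected %d fields, got %d" % (len(COLUMNS), len(values)))
    record = dict(zip(COLUMNS, values))
    # Assumption: numeric subject 2463 becomes nd1S02463
    record["subjectid"] = "nd1S%05d" % int(record["subjectid"])
    # Assumption: line-format dates are month/day/year
    parsed = datetime.strptime(record["date"], "%m/%d/%Y")
    record["date"] = parsed.strftime("%Y-%m-%d %H:%M:%S")
    # Assumption: newly converted recordings start out unvalidated
    record["state"] = "unvalidated"
    lines = ["%s\tstring\t%s" % (name, value) for name, value in record.items()]
    # The data file entry reuses the shot file name, as in the example
    lines.append("file\tfile\t%s" % record["shotid"])
    return "\n".join(lines)


old_line = ("2008-253-042-1_L-lg4000.tiff\t2463\tLeft\ttiff\tBrown\t09/09/2008\t"
            "No\tnd4T00015\tInside\tnd4E00054\tnd4N00020\tnd4I00013\t1")
print(line_to_record(old_line))
```

When converting a whole file, the converted records would be joined with a blank line between them, matching the separator in the name/value example.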