How to Write BDPs and ATs

Basic Examples

Below are basic (Hello World level) examples of how to write a BDP and an AT, followed by documentation on how to incorporate new BDPs and ATs into the ADMIT system. A later section shows more complex examples. Each example file can be found in admit/doc/examples if you want to inspect it more closely or experiment with it.

HelloWorld_BDP

The Basic Data Product (BDP) is the basic unit of data storage in ADMIT. Each BDP is specified in a single file in admit/bdp. The file contains a single class which holds the data. Every BDP must inherit from the BDP base class, or from one or more other BDPs. Each BDP should define only a minimal number of methods, because BDPs are not meant to do much, if any, processing.

"""Hello World BDP
   ---------------

   This module defines the HelloWorld_BDP class.
"""
from admit.bdp.BDP import BDP

class HelloWorld_BDP(BDP):
    """ An example BDP class.

        Parameters
        ----------
        xmlFile : string, optional
            Basename of the file where the BDP data will be stored on write

        Attributes
        ----------
        yourname : string
            The name of the person in the BDP

        planet : string
            The name of the home planet

        star : string
            The name of the home star

    """
    def __init__(self,xmlFile=None,**keyval):
        BDP.__init__(self,xmlFile)
        self.yourname = ""
        self.planet = ""
        self.star = ""
        self.setkey(keyval)
Line # Notes
1-5 Initial docstring documentation. See the section on Documentation Style Guide for details.
6 Required import statement – the class that HelloWorld_BDP inherits from, others (e.g. Line_BDP, Image_BDP, & Table_BDP) can also be used. Any other needed includes should be added here.
8 The BDP class name with inheritance. The class name must match the file name (except for the .py), in this example the file is called HelloWorld_BDP.py. Also be sure to include any inheritance.
9-27 Notes and documentation for the class. See the section on Documentation Style Guide for details.
28 The __init__ function signature. The method takes only 3 arguments: self, xmlFile, and keyval, and no others. This is so that any BDP can be initialized with empty parentheses (e.g. a = HelloWorld_BDP()).
29 Initialize the base class or any other parent classes.
30-32 Initialization of attributes. Any attributes that need to be saved when the BDP is written to disk must be initialized here along with default values. Each default value must be of the same data type that the attribute will hold. Attributes can be of any python basic data type (integer, long, float, string, boolean, list, dictionary, set, and tuple) and numpy arrays. To store tables, molecular or atomic data, or images, inherit from Table_BDP, Line_BDP, or Image_BDP respectively. The following attribute names are reserved and cannot be used: project, sous, gous, mous, _date, _type, xmlFile, _updated, _taskid, _usedby, uid, and alias. Those that start with an “_” should not be modified by code in a BDP or AT; they are system variables that are set and maintained in the background. The following method names are reserved and cannot be overloaded: getfiles, show, depends_on, report, set, get, setkeys, write, and delete.
33 This method call sets any given key-value pairs passed in via keyval and must be after the parent class initialization and attribute definition.
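The rule in the notes above, that each attribute's default value fixes the type it may hold, can be illustrated with a small stand-in class. This is a minimal sketch and not ADMIT's BDP implementation: MiniBDP, MiniHello, and the checks inside setkey are assumptions made for illustration, and the real base class may validate or coerce values differently.

```python
class MiniBDP(object):
    """Hypothetical stand-in for a BDP base class (not part of ADMIT)."""

    def __init__(self, xmlFile=None):
        self.xmlFile = xmlFile

    def setkey(self, keyval):
        # Accept only attributes that already exist, and only values whose
        # type matches the type of the current (default) value.
        for name, value in keyval.items():
            if not hasattr(self, name):
                raise KeyError("unknown attribute: %s" % name)
            expected = type(getattr(self, name))
            if not isinstance(value, expected):
                raise TypeError("%s must be of type %s"
                                % (name, expected.__name__))
            setattr(self, name, value)


class MiniHello(MiniBDP):
    def __init__(self, xmlFile=None, **keyval):
        MiniBDP.__init__(self, xmlFile)
        # The defaults define both the attribute names and their types.
        self.yourname = ""
        self.planet = ""
        # Must come after the parent initialization and the defaults.
        self.setkey(keyval)
```

With this sketch, MiniHello() and MiniHello(yourname="Bill") both work, while MiniHello(planet=4) raises a TypeError because the default for planet is a string.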

HelloWorld_AT

The ADMIT Task (AT) is the basic unit of processing in ADMIT. All ATs have the same __init__ signature and must inherit from the AT base class, or from one or more other ATs.

"""HelloWorld
   ----------

   Defines the HelloWorld class.
"""
from admit.AT import AT
from admit.bdp.HelloWorld_BDP import HelloWorld_BDP

class HelloWorld_AT(AT):
    """ Basic example of an ADMIT Task

        Parameters
        ----------
        keyval : keyword - value pairs passed to the constructor for ease
            of assignment.

        Attributes
        ----------
        None
    """
    def __init__(self,**keyval):
        # set the key words up with default values
        keys = {"yourname": "",
                "planet"  : ""}
        AT.__init__(self,keys,keyval)
        self._version = "1.0.0"
        self.set_bdp_out([(HelloWorld_BDP,1)])

    def run(self):
        # Destroy old output BDPs before rebuilding the output BDP list.
        # Omit this call if you will reuse existing BDPs.
        self.clearoutput()

        filename = "test.txt"
        hw = HelloWorld_BDP()
        hw.yourname = self.getkey("yourname")
        hw.planet = self.getkey("planet")
        if self.getkey("planet") == "Earth" :
            hw.star = "Sol"
        else:
            hw.star = "Unknown"
        hw.favoritefile = filename
        self.addoutput(hw)
Line # Notes
1-5 Initial docstring documentation. See the section on Documentation Style Guide for details.
6 Required import statement. Most ATs will inherit only from the base AT class. See the Complex AT for information on how to inherit from any other AT.
7 Any BDPs that are used (either as inputs or outputs) by the AT should be imported next.
9 The AT class name with inheritance. As with the BDPs, the AT class name must match the name of the file it is defined in (minus the .py extension), in this case HelloWorld_AT.py.
10-20 Notes and documentation for the class. See the section on Documentation Style Guide for details.
21 The __init__ function signature. The method takes only 2 arguments: self and keyval and no others. This is so that any AT can be initialized with empty parentheses (e.g. a = HelloWorld_AT()).
23-24 Define all keywords and assign default values. Unlike BDPs, the AT keywords are kept in a single dictionary where the key is the keyword name. As with BDPs, every keyword and attribute must have a default value of the same type as the data it is expected to hold. Any data stored in the keyword dictionary or defined in the __init__ method will be saved to disk. Attributes can be of any python basic data type (integer, long, float, string, boolean, list, dictionary, set, and tuple) and numpy arrays. The following attribute names are reserved and cannot be used: _stale, _enabled, _do_polt, _plot_mode, _type, _bdp_out, _bdp_out_map, _bdp_in, _bdp_in_map, _valid_bdp_in, _valid_bdp_out, _taskid, _version, and _needToSave. Reserved method names are: markUpToDate, markChanged, uptodate, isstale, show, check, setkey, getkey, haskey, checktype, addoutput, addinput, clearinputs, clearoutputs, getVersion, getdtd, write, save, execute, checkfiles, copy, and validateinput.
25 Initialize the parent class, passing both the dictionary of keywords and values and the input keyval argument.
26 Set some of the static attributes, in this case the version.
27 Set the _valid_bdp_in, and/or _valid_bdp_out. See the following sections for a description of how to use _valid_bdp_in and _valid_bdp_out.
29 Define the run method, which takes no arguments except self. Every AT should implement the run method as this is the method that does all of the work of the AT.
32 Clear the output BDP array. If this call is not made then any produced BDPs will be added to the existing list, until slots run out.
35 Initialize the HelloWorld BDP which is the output of this AT.
36-38 Set some of the BDP values based on keywords in the AT. The use of getkey is the only supported method of getting the value of an AT keyword. Similarly, setkey is the only supported method of changing the value of a keyword.
43 Add the BDP to the AT output list. Any BDPs that are the products of the AT must be added to the AT with this call.
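The keyword handling described in the notes above (a single dictionary of defaults, overridden by keyval, and accessed only through getkey and setkey) can be sketched with a stand-in class. MiniAT and MiniHelloAT are hypothetical names, and the type check is an assumption for illustration; this is not the ADMIT AT base class.

```python
class MiniAT(object):
    """Hypothetical stand-in for an AT base class (not part of ADMIT)."""

    def __init__(self, keys, keyval):
        # Start from the defaults, then apply the caller's overrides.
        self._keys = dict(keys)
        for name, value in keyval.items():
            self.setkey(name, value)

    def setkey(self, name, value):
        if name not in self._keys:
            raise KeyError("unknown keyword: %s" % name)
        if not isinstance(value, type(self._keys[name])):
            raise TypeError("wrong type for keyword: %s" % name)
        self._keys[name] = value

    def getkey(self, name):
        return self._keys[name]


class MiniHelloAT(MiniAT):
    def __init__(self, **keyval):
        # Every keyword gets a default of the type it is expected to hold.
        keys = {"yourname": "", "planet": ""}
        MiniAT.__init__(self, keys, keyval)

    def run(self):
        # Same decision as HelloWorld_AT.run: only Earth has a known star.
        return "Sol" if self.getkey("planet") == "Earth" else "Unknown"
```

Because all keywords have defaults, MiniHelloAT() can be created with empty parentheses, mirroring the requirement on the real __init__ signature.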

Adding New BDPs and ATs to ADMIT

New BDP files need to be placed in admit/bdp and new AT files in admit/at. Once they are in place, run the dtdGenerator (Note: need to decide how we want this to be called in the end). The dtdGenerator should also be rerun whenever the attributes of any BDP or AT are changed. You can then start using your new BDP/AT in ADMIT.
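Before adding a file, it can help to confirm that it follows the naming rule stated earlier: the class name must match the file name without the .py extension. The helper below is a hypothetical sketch using only the Python standard library; class_matches_file is not part of ADMIT and the dtdGenerator does not depend on it.

```python
import ast
import os


def class_matches_file(path, source):
    """Return True if `source` defines a top-level class named after `path`.

    `path` is the file name (e.g. admit/bdp/HelloWorld_BDP.py) and `source`
    is the text of that file.
    """
    expected = os.path.splitext(os.path.basename(path))[0]
    tree = ast.parse(source)
    names = [node.name for node in tree.body
             if isinstance(node, ast.ClassDef)]
    return expected in names
```

For example, a file named HelloWorld_BDP.py passes the check only if it contains a class called HelloWorld_BDP.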

Note: needs more

Writing Unit and Integration Tests

For each AT there should be two test programs written: a unit test and an integration test. Below are a description and an example of each.

Unit Test

The purpose of the unit test is to test the functionality of the AT itself; it need not depend much on outside systems. The test should exercise as much of the AT code as possible. Here is the unit test for the HelloWorld AT:

#! /usr/bin/env casarun
#
#
#   you can either use the "import" method from within casapy
#   or use the casarun shortcut to run this from a unix shell
#   with the argument being the casa image file to be processed
#
""" Right now you need to run this test inside of casapy

This test does the following:
    creates an admit class
    creates a helloworld AT
    sets some helloworld parameters
    adds the helloworld AT to the admit class
    runs admit (which in turn runs the needed AT's)
    writes the results out to disk
    reads them into a new admit instance
    prints out one of the BDP xml file names

    to run this test do the following:
        import admit.at.test.test_helloworld as th
        th.run()
"""
import admit.Admit as ad
import admit.at.HelloWorld_AT as hw
import admit.at.Ingest_AT as ia
import admit.util.bdp_types as bt

def run():
    # instantiate the class
    a = ad.Admit()

    # instantiate a HelloWorld AT
    h = hw.HelloWorld_AT()
    # add the HelloWorld AT to the admit class
    a.addtask(h)
    # set some HelloWorld parameters
    h.setkey("yourname","Bill")
    h.setkey("planet","Mars")

    # run admit (specifically the tasks that need it)
    h.execute()
    # save it out to disk (this will not be needed soon, as I am working on
    # a way to write out the xml inside of the run command)
    a.write()

    a2 = ad.Admit()   # read in the admit.xml and bdp.xml files

    print "These pairs should match"
    for at in a.fm:
        print "FlowManager task ",a.fm[at]
        print "FlowManager task ",a2.fm[at]
        print "LEN ",len(a.fm[at]._bdp_out)
        print "LEN ",len(a2.fm[at]._bdp_out)
        print "Input ",a.fm[at]._bdp_in[0]._taskid
        print "Input ",a2.fm[at]._bdp_in[0]._taskid
        print "\n\n"


    print "Conn map ",a.fm._connmap
    print "Conn map ",a2.fm._connmap
    print "\n\n"

    print "Deps map ",a.fm._depsmap
    print "Deps map ",a2.fm._depsmap
    print "\n\n"

    for at in a.fm:
        for i in a.fm[at]._bdp_out :
            if(i.xmlFile == a2.fm[at]._bdp_out[0].xmlFile):
                print "File ",i.xmlFile
                print "File ",a2.fm[at]._bdp_out[0].xmlFile
                print "\n\nPASS\n"
                return
    print "\n\nFAIL\n"

if __name__ == "__main__":
    import sys

    argv = ad.casa_argv(sys.argv)
    if len(argv) > 1:
        print "Working on ",argv[1]
        # run() is defined without parameters, so call it with none
        run()
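The script above needs casapy and a full Admit instance. For logic that has no such dependence, a test can also be written directly with Python's unittest module, the same module the integration test uses. The following is a minimal sketch: pick_star is a hypothetical helper that mirrors the star selection in HelloWorld_AT.run, and it is not a function that exists in ADMIT.

```python
import unittest


def pick_star(planet):
    """Hypothetical helper mirroring the star choice in HelloWorld_AT.run."""
    return "Sol" if planet == "Earth" else "Unknown"


class TestPickStar(unittest.TestCase):
    def test_earth(self):
        self.assertEqual(pick_star("Earth"), "Sol")

    def test_other_planet(self):
        self.assertEqual(pick_star("Mars"), "Unknown")


def run_tests():
    # Build and run the suite explicitly so the result can be inspected.
    suite = unittest.TestLoader().loadTestsFromTestCase(TestPickStar)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() returns a result object whose wasSuccessful() method reports whether every test passed.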

Integration Test

The purpose of the integration test is to test the AT inside of the ADMIT system. Here is the integration test for the HelloWorld AT:

#! /usr/bin/env casarun
#
#
#   you can either use the "import" method from within casapy
#   or use the casarun shortcut to run this from a unix shell
#   with the argument being the casa image file to be processed
#
""" Right now you need to run this test inside of casapy

This test does the following:
    creates an admit class
    creates a helloworld AT
    sets some parameters
    adds the helloworld AT to the admit class
    runs admit (which in turn runs the needed AT's)
    writes the results out to disk
    reads them into a new admit instance
    prints out one of the BDP xml file names

"""
import admit.Admit as ad
import admit.at.HelloWorld_AT as hwa
import admit.bdp.HelloWorld_BDP as hw
import admit.util.bdp_types as bt
import os
import unittest

class IntegTestHelloWorldAT(unittest.TestCase):

    def setUp(self):
        self.inputFile  = "mom_integ_test_input.fits"
        self.outputFile = "mom_integ_test_output"
        self.foobar()

    def tearDown(self):
        self.cleanup()

    def cleanup(self):
        pass
    def foobar(self):
        print "Cleaning up...\n"
        try:
            cmd = "/bin/rm -f %s.helloworld_*.bdp" % self.inputFile
            os.system( cmd )
        except Exception as ex:
            print "failed to remove %s files " % self.inputFile
            print ex
            #pass
        try:
            os.system("/bin/rm -rf ipython*.log")
        except:
            print "failed to remove ipython logs"
            #pass
        try:
            os.system("/bin/rm -rf casapy*.log")
        except:
            print "failed to remove casapy logs"
            #pass
        try:
            os.remove("admit.xml")
        except:
            print "failed to remove admit.xml"
            #pass

    # Call the main method runTest() for automatic running.
    #
    #
    def runTest(self):

        # instantiate the class
        a = ad.Admit()
        a.pmode = -1 # no plotting, please

        #fitsin = ia.Ingest_AT(file=self.inputFile)
        #fitsin.setkey('file',self.inputFile)
        #fitsin.setkey('symlink',True)
        #task0id = a.addtask(fitsin)

        # instantiate hello world at
        print "########### instantiate a helloworld AT ##############"
        h = hwa.HelloWorld_AT(yourname="Bob")

        print "########### add helloworld task ##############"
        task1id = a.addtask(h)

        # check the fm
        print "########### fm.verify ##############"
        a.fm.verify()


        if True:
            print "########### admit.run ##############"
            # run admit
            a.run()
            # save it out to disk.
            a.write()

            a2 = ad.Admit()   # read in the admit.xml and bdp files

            print "========"
            print "FM in memory"
            a.fm.show()
            print "FM read in"
            a2.fm.show()
            print "========"
            self.assertEqual(len(a.fm),len(a2.fm))
            for atask in a.fm:
                self.assertEqual(len(a.fm[atask]._bdp_out),
                                 len(a2.fm[atask]._bdp_out))
                if(len(a.fm[atask]._bdp_in) != 0 and len(a2.fm[atask]._bdp_in) != 0):
                    self.assertEqual( a.fm[atask]._bdp_in[0]._taskid,
                                      a2.fm[atask]._bdp_in[0]._taskid)


            self.assertEqual(a.fm._connmap,a2.fm._connmap)

            for at in a.fm:
                for i in range(len(a.fm[at]._bdp_out)) :
                    #print "%d %d %d %s s\n" % (i ,
                    print "%d %d %d %s %s\n" % (i ,
                             a.fm[at]._bdp_out[i]._taskid,
                            a2.fm[at]._bdp_out[i]._taskid,
                             a.fm[at]._bdp_out[i].xmlFile,
                            a2.fm[at]._bdp_out[i].xmlFile )

###############################################################################
# END CLASS                                                                   #
###############################################################################

suite = unittest.TestLoader().loadTestsFromTestCase(IntegTestHelloWorldAT)
unittest.TextTestRunner(verbosity=0).run(suite)

Some Inner Workings You Should Know

The bdp_types.py File

Since Python does not directly support enumerated types, the ADMIT system uses static strings to facilitate comparisons of BDP and AT types and other internal data management.
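The static-string approach can be sketched as a set of module-level constants that the rest of the code compares against, instead of repeating string literals. The constant values and the is_required helper below are assumptions for illustration; the actual contents of bdp_types.py may differ.

```python
# Hypothetical constants in the style of bdp_types.py (values are assumed).
REQUIRED = "Required"
OPTIONAL = "Optional"

VALID_FLAGS = (REQUIRED, OPTIONAL)


def is_required(flag):
    """Compare against the shared constant rather than a literal string."""
    if flag not in VALID_FLAGS:
        raise ValueError("unknown flag: %r" % (flag,))
    return flag == REQUIRED
```

Referring to the constants by name means a misspelled name fails immediately with a NameError or AttributeError, whereas a misspelled string literal would silently compare as unequal.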

Note: needs more.

The valid_bdp_in Attribute

In order for the ADMIT system to know what BDPs an AT will take as input, each AT specifies the numbers and types of BDPs it expects in the _valid_bdp_in attribute. The _valid_bdp_in attribute is a list of tuples, where each tuple specifies one type of input. The tuple consists of 3 parts: the BDP type, the number, and whether they are required or optional. Let's take the following examples:

(Moment_BDP,1,bt.REQUIRED)

(LineList_BDP,0,bt.OPTIONAL)

The first example states that 1 Moment_BDP (or any BDP that inherits from Moment_BDP) is required as input; the second states that zero or more LineList_BDPs (or any BDP that inherits from LineList_BDP) are optional inputs. To combine them, one would write:

self.set_bdp_in([(Moment_BDP,  1, bt.REQUIRED),
                 (LineList_BDP,0, bt.OPTIONAL)])

Order is important when specifying the _valid_bdp_in: the Flow Manager fills in the inputs in the order in which they appear, and optional BDPs must come after all required BDPs. To specify that one or more of a BDP type can be an input, write:

self.set_bdp_in([(Moment_BDP, 1, bt.REQUIRED),
                 (Moment_BDP, 0, bt.OPTIONAL)])
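To make the meaning of these tuples concrete, the sketch below checks an ordered list of inputs against such a specification, treating a count of 0 as "zero or more". It is an illustration only: inputs_match, the stand-in BDP classes, and the REQUIRED/OPTIONAL values are assumptions, and the Flow Manager's actual matching logic may differ.

```python
REQUIRED = "Required"   # hypothetical stand-ins for bt.REQUIRED / bt.OPTIONAL
OPTIONAL = "Optional"


class Moment_BDP(object):      # stand-in classes, not the ADMIT ones
    pass


class LineList_BDP(object):
    pass


def inputs_match(spec, bdps):
    """Return True if the ordered list `bdps` satisfies `spec`."""
    pos = 0
    for bdp_type, count, flag in spec:
        matched = 0
        # A count of 0 means "zero or more"; otherwise take at most `count`.
        # isinstance also accepts BDPs that inherit from bdp_type.
        while pos < len(bdps) and isinstance(bdps[pos], bdp_type):
            if count and matched == count:
                break
            pos += 1
            matched += 1
        if flag == REQUIRED and matched < count:
            return False
    # Every supplied BDP must have been claimed by some entry.
    return pos == len(bdps)
```

With the "one or more" specification above, a list of three Moment_BDPs is accepted (one required, two optional) and an empty list is rejected.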

Note: may need more work.

The valid_bdp_out Attribute

Similar to the _valid_bdp_in attribute, the _valid_bdp_out attribute tells the Flow Manager what type(s) of BDPs each AT will produce. The _valid_bdp_out is a list of tuples, where each tuple contains the type of BDP and the number to expect. For example:

(Moment_BDP,2)

(SpwCube_BDP,0)

The first example states that 2 Moment_BDPs will be the output, and the second states that zero or more SpwCube_BDPs will be output. As with the _valid_bdp_in these can be combined:

self.set_bdp_out([(Moment_BDP, 2),
                  (SpwCube_BDP,0)])

Order is important, so in this example the first two BDPs in the output BDPs will be of type Moment_BDP and any others will be of type SpwCube_BDP.
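The ordering rule can be expressed as a lookup from output slot to expected type. The sketch below is an assumption-laden illustration: expected_output_type and the stand-in classes are hypothetical and are not how ADMIT necessarily implements the check.

```python
class Moment_BDP(object):      # stand-in classes, not the ADMIT ones
    pass


class SpwCube_BDP(object):
    pass


def expected_output_type(spec, index):
    """Return the BDP type expected at output slot `index`, or None."""
    start = 0
    for bdp_type, count in spec:
        if count == 0:
            # Zero or more: this entry covers every remaining slot.
            return bdp_type
        if index < start + count:
            return bdp_type
        start += count
    # The slot lies beyond what the specification allows.
    return None
```

For the combined specification above, slots 0 and 1 map to Moment_BDP and every later slot maps to SpwCube_BDP.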

Using Tables, Images, and Lines

This section shows you how to use the Table, Image, and Line BDPs and their underlying classes.

Table_BDP

The Table_BDP (design, api) is essentially a BDP wrapper for the Table (design, api) class. The Table_BDP class has a single non-inherited attribute: table, which is an instance of the Table class. The Table class has the following attributes:

columns A List containing the names of the columns, columns can be retrieved by name or by column number (0 based index).
units A List containing the units for the columns.
planes A List containing the names for each plane (3D only) in the table.
description A string for a description or caption of the table.
data A NumPy array of the actual data, can contain a mix of types.

Any of the attributes can be set via the constructor or via the set command. Below is an example of constructing a Table_BDP.

cols = ["Number","Square"]
units = [None,None]
dat = np.array([[1,1],[2,4],[3,9]])
desc = "Numbers and their square"
tbl = Table_BDP()
tbl.table.setkey("columns",cols)
tbl.table.setkey("units",units)
tbl.table.setkey("data",dat)
tbl.table.setkey("description",desc)
Line # Notes
1 Create the column labels
2 Create the units, in this case there are none, but we set them for completeness.
3 Create the numpy array for the data, in many cases this may come from one or more external tasks.
4 Create the description of the table.
5 Instantiate a Table_BDP.
6-9 Set the individual data members for the table.

In this simple Table example the data members were created and then set individually in the Table_BDP class. No check is made that the number of data columns matches the length of the columns or units lists, because the members can be set in any order and will not agree until all of them are set. Below is a more complex example where the data columns are of different data types.

cols = ["Atom","# electrons"]
desc = "Listing of the atoms and number of electrons."
atoms = np.array(["Hydrogen","Helium","Oxygen","Nitrogen"])
electrons = np.array([1,2,8,7])
edata = np.column_stack((atoms.astype("object"),electrons))
table = Table(columns=cols,description=desc,data=edata)
tbdp = Table_BDP(table=table)
Line # Notes
1 Create the column labels
2 Create the description of the table
3-4 Create columns of data, these may come from external tasks.
5 Stack the columns together to create a single 2D array. Note that the first column (atoms) has the addition of .astype(“object”). This is necessary if you are creating a table with different data types (strings and ints in this case), as it will preserve the data type in the array. Otherwise all columns will be converted to the lowest possible type (strings in this case).
6 Create a Table with the given parameters.
7 Create a Table_BDP with the given table as its contents.
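As noted for the first Table example, nothing checks that the column names and units agree with the width of the data, so a task may want to verify this itself once all members are set. The helper below is a hypothetical sketch (check_table_shape is not part of ADMIT); it works on any sequence of rows, including a plain list of lists or a 2D NumPy array.

```python
def check_table_shape(columns, units, data):
    """Return a list of problems; an empty list means the table is consistent."""
    problems = []
    # Collect the distinct row lengths found in the data.
    widths = set(len(row) for row in data)
    if len(widths) > 1:
        problems.append("rows have differing lengths")
    width = widths.pop() if len(widths) == 1 else None
    if width is not None and len(columns) != width:
        problems.append("expected %d column names, got %d"
                        % (width, len(columns)))
    if width is not None and len(units) != width:
        problems.append("expected %d units, got %d" % (width, len(units)))
    return problems
```

Returning a list of messages rather than raising on the first mismatch lets the caller report every inconsistency at once.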

Image_BDP

Line_BDP

More Complex Examples

This section presents more complex examples of BDPs and ATs, including how to inherit from other BDP types.

Complex BDP

Complex AT