| Data Acquisition and Online Analysis at the NSCL |
|---|
| Chapter 5. Best Practices for Building Software |
Things change. Software is no exception. In fact, the very nature of software ensures that it can and will change. We want to be sure that when we do need to change our software:
We can accomplish that change easily.
We can accomplish that change reliably.
I will first show how to isolate change in the various environments we deal with in software development for nscldaq, SpecTcl and device support. Once we know how to manage change within each of these domains, we will naturally want to know how to isolate and manage change for definitions that span domain boundaries.
Let's start by considering the following simple Shell script.
Example 5-1. Shell script without any change isolation
#!/bin/bash
/usr/opt/daq/7.4/Scripts/ReadoutShell.tcl spdaq22.nscl.msu.edu \
/user/fox/daq/Readout/Readout \
events.nscl.msu.edu
This script starts up the Readout GUI with the readout program /user/fox/daq/Readout/Readout running on spdaq22.nscl.msu.edu. Event recording will occur via ftp on events.nscl.msu.edu. If you have read the previous chapter, you will have noticed that we made no effort to kill off old Readout programs. That was a deliberate omission for the sake of simplicity.
If you have thought about isolating changeable items from unchanging items, I'm sure that several obvious issues are jumping right out of the page at you.
You may need to change where the Readout program runs due to a computer upgrade, a failure during the experiment, or a difference between your development environment and running environment.
You may need to change where the event data are recorded as the NSCL configuration changes.
The first idea that suggests itself is to pull the definitions of these two items out of the command for ReadoutShell.tcl:
Example 5-2. Isolating the nodes
#!/bin/bash
# Edit the configuration below if needed:
DAQHOST="spdaq22.nscl.msu.edu"
EVTHOST="events.nscl.msu.edu"
/usr/opt/daq/7.4/Scripts/ReadoutShell.tcl ${DAQHOST} \
/user/fox/daq/Readout/Readout \
${EVTHOST}
In this script, a simple edit to the lines at the top of the file changes the data taking system, the recording system, or both. What about the Readout program? We may want to use someone else's readout program. They may locate theirs in a different part of the directory tree than we do, and they may name it differently. In fact, as we will see later, the readout program may live in a project directory that is not associated with a specific Unix user, but with a collaboration of users that maintain this software as a team.
From the discussion above, it seems that the Readout program path consists of three elements: the base directory (the user's home directory or the project directory), the subdirectory within that base directory, and the name of the file in that subdirectory. Our next set of definitions will reflect this:
Example 5-3. Abstracting Readout File name elements.
#!/bin/bash
DAQHOST="spdaq22.nscl.msu.edu"
EVTHOST="events.nscl.msu.edu"
EXPSWROOT="/user/fox/"
READOUTSUBDIR="daq/Readout/"
READOUTFILENAME="Readout"
RDOFILE="${EXPSWROOT}${READOUTSUBDIR}${READOUTFILENAME}"
/usr/opt/daq/7.4/Scripts/ReadoutShell.tcl ${DAQHOST} \
${RDOFILE} \
${EVTHOST}
It looks like we have change management well in hand for this script, but I want to raise two other issues. The first is that we may change versions of the NSCLDAQ software at some later time; perhaps we will want to move to the current version (8.0). We may also want to use someone else's Readout GUI if something like that emerges. We therefore want to do the same sort of thing with the path to ReadoutShell.tcl that we did with the Readout path. In this case, we must recognize that the path consists of a root for the nscldaq installation, an nscldaq version, a subdirectory, and the name of the GUI script. This gives:
Example 5-4. Abstracting the Readout GUI Shell Script
#!/bin/bash
DAQHOST="spdaq22.nscl.msu.edu"
EVTHOST="events.nscl.msu.edu"
EXPSWROOT="/user/fox/"
READOUTSUBDIR="daq/Readout/"
READOUTFILENAME="Readout"
RDOFILE="${EXPSWROOT}${READOUTSUBDIR}${READOUTFILENAME}"
DAQSWROOT="/usr/opt/daq/"
DAQVERSION="7.4"
DAQRDOSHELLSUBDIR="/Scripts/"
DAQRDOSHELLNAME="ReadoutShell.tcl"
RDOSHELL="${DAQSWROOT}${DAQVERSION}${DAQRDOSHELLSUBDIR}${DAQRDOSHELLNAME}"
${RDOSHELL} ${DAQHOST} \
${RDOFILE} \
${EVTHOST}
The second consideration I promised you is that many of these definitions are useful to other shell scripts. One important way to keep things consistent is to ensure that anything you do is done only once. Don't duplicate code or data without a very good reason for it. The definitions for DAQHOST, EVTHOST, EXPSWROOT, DAQSWROOT and DAQVERSION are definitions that deserve to be factored out of this file. In software, as with arithmetic, factorization is the process of 'centralizing' commonality.
With shell scripts the tools we have for factorization are to create shell and environment variables as we have already done, and to centralize these definitions in additional scripts. This organization is reflected in the next example, which shows the partial contents of .bashrc, partial contents of a new shell script called expconfig.sh which will contain the definitions of environment variables we'll use for the experiment, and the modified godaq script we have been working with.
Example 5-5. Factoring Definitions into expconfig.sh
The .bashrc file excerpt.
..
. ${HOME}/expconfig.sh
..
The expconfig.sh excerpt.
...
DAQHOST="spdaq22.nscl.msu.edu"
EVTHOST="events.nscl.msu.edu"
EXPSWROOT="/user/fox/"
DAQSWROOT="/usr/opt/daq/"
DAQVERSION="7.4"
export DAQHOST EVTHOST EXPSWROOT DAQSWROOT DAQVERSION
...
The final godaq script.
#!/bin/bash
. ${HOME}/expconfig.sh
READOUTSUBDIR="daq/Readout/"
READOUTFILENAME="Readout"
RDOFILE="${EXPSWROOT}${READOUTSUBDIR}${READOUTFILENAME}"
DAQRDOSHELLSUBDIR="/Scripts/"
DAQRDOSHELLNAME="ReadoutShell.tcl"
RDOSHELL="${DAQSWROOT}${DAQVERSION}${DAQRDOSHELLSUBDIR}${DAQRDOSHELLNAME}"
${RDOSHELL} ${DAQHOST} \
${RDOFILE} \
${EVTHOST}
Now all the changes we have anticipated making are in expconfig.sh if they are system wide, or in the godaq file if we are confident that their scope will be confined to that script.
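To make the payoff of this factoring concrete, here is a minimal sketch that runs entirely in a scratch directory. The helper rdoshell_path and the scratch copy of expconfig.sh are illustrative assumptions, not part of NSCLDAQ, and DAQSWROOT is set so that the composed path matches the one used in Example 5-1.

```shell
#!/bin/bash
# Worked demonstration in a scratch directory; nothing here touches a real installation.
workdir=$(mktemp -d)

# A cut-down, hypothetical expconfig.sh holding the system wide definitions.
cat > "$workdir/expconfig.sh" <<'EOF'
DAQSWROOT="/usr/opt/daq/"
DAQVERSION="7.4"
export DAQSWROOT DAQVERSION
EOF

# Hypothetical helper: compose the GUI path from whatever expconfig.sh currently says.
# The subshell keeps the sourced variables from leaking into this script.
rdoshell_path() {
    ( . "$1/expconfig.sh"
      echo "${DAQSWROOT}${DAQVERSION}/Scripts/ReadoutShell.tcl" )
}

before=$(rdoshell_path "$workdir")

# A single edit to the shared file moves every script that sources it to version 8.0.
sed 's/^DAQVERSION=.*/DAQVERSION="8.0"/' "$workdir/expconfig.sh" > "$workdir/expconfig.new"
mv "$workdir/expconfig.new" "$workdir/expconfig.sh"

after=$(rdoshell_path "$workdir")

echo "$before"
echo "$after"
rm -rf "$workdir"
```

The first line printed uses the 7.4 tree and the second the 8.0 tree, although the code that composes the path never changed.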
The concept of scope is very important for configuration management. Things that are changeable generally have some domain of effect. For example, changing DAQHOST affects any program that has to do with getting online data. It is said to have system wide scope. On the other hand, the variables defined in godaq only affect how godaq itself operates. These variables are said to be application scoped (the application is the godaq script). In other cases, not yet shown, there may be configurable items that affect several related applications. These are said to have subsystem scope, where we define a subsystem to be a set of applications that work together to accomplish a cohesive aim (for example, all the control applications in a device).
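In bash, the difference between system wide and application scope shows up directly in whether a variable is exported. The following is a minimal sketch, reusing the variable names from the examples above; the child process started with bash -c stands in for any other program launched from the script.

```shell
#!/bin/bash
# System wide: exported, so every child process inherits it.
DAQHOST="spdaq22.nscl.msu.edu"
export DAQHOST

# Application scoped: a plain shell variable, visible only inside this script.
READOUTSUBDIR="daq/Readout/"

# Ask a child process what it can see of the two variables.
child_view=$(bash -c 'echo "DAQHOST=${DAQHOST:-unset} READOUTSUBDIR=${READOUTSUBDIR:-unset}"')
echo "$child_view"
```

Assuming READOUTSUBDIR is not already exported by the calling environment, the child reports a value for DAQHOST and reports READOUTSUBDIR as unset.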
Tcl scripts have many of the same issues that shell scripts have. Consider the following small script that might be part of an add-on for SpecTcl.
Example 5-6. Prompting for an Event File
proc getEventFile {} {
return [tk_getOpenFile -title "Select an Event File" \
-defaultextension .evt \
-filetypes {{{Event Files} .evt} \
{{All Files} *}} \
-initialdir ~/stagearea/complete]
}
This script prompts for an event file by using a file chooser dialog. Event files will, by default, have the file type .evt and will be located in ~/stagearea/complete. If the user selects a file from the chooser, it will be returned to the caller. If the user cancels, an empty string is returned. This proc could be part of an addition to the SpecTcl GUI that opens event files for replay.
What are the things in this script that might change? Consider what happens as you move from analyzing files during data taking to offline analysis after the experiment is over. When the experiment is over, users at the NSCL must back up their event data to tape, reload the event data onto file servers available to data analysis systems and, after verifying the success of this operation, remove the data from the acquisition file servers.
While this system seems cumbersome it:
Ensures that experimenters have created a valid backup tape of the experiment primary data. This is good practice even if there were not a requirement from the NSF to retain data in permanent form for a number of years after the data are taken.
Puts the data into a segment of the lab's network that's isolated from data taking traffic.
Puts the data onto file servers with much greater capacity than those available on the data taking part of the network.
Makes the event space used during the experiment available for future experiments.
In any event, the changeable item is the directory in which the GUI initially looks for event files. This is different during data taking from what it will be after data taking. Our first cut at managing this looks like this:
Example 5-7. Making a global configuration parameter
set eventFileDirectory "~/stagearea/complete"
proc getEventFile {} {
global eventFileDirectory
return [tk_getOpenFile -title "Select an Event File" \
-defaultextension .evt \
-filetypes {{{Event Files} .evt} \
{{All Files} *}} \
-initialdir $eventFileDirectory]
}
For this simple example, this is sufficient. Suppose, however, that this is part of a very large system. In that case the number of configurable parameters may be quite large. Individual global variables for configuration information are not very scalable. Looking at the source code, how can you be sure which parameters are configuration variables and which are just 'ordinary' global variables?
Accepted Tcl practice is to put the configuration of Tcl applications into an array. Remember, in Tcl, array indices are strings. This practice calls for making the name of the configuration item the array index for the item. For example:
Example 5-8. Using a configuration array:
set configuration(something) "aconfigvalue"
set configuration(eventFileDirectory) "~/stagearea/complete"
set configuration(somethingElse) "somevalue"
...
proc getEventFile {} {
global configuration
set eventFileDirectory $configuration(eventFileDirectory)
return [tk_getOpenFile -title "Select an Event File" \
-defaultextension .evt \
-filetypes {{{Event Files} .evt} \
{{All Files} *}} \
-initialdir $eventFileDirectory]
}
Now all configuration items can be factored into a single section of code that loads the configuration array. As with our shell scripting examples, however, the location of event files may be an item of interest to other parts of the system. The eventFileDirectory, in fact, should have system wide scope. Therefore we'll split this up into two files: one that contains only configuration settings, and the other the application that includes our proc.
Example 5-9. Putting Tcl script common definitions into a configuration script:
The configuration script excerpt ~/expconfig.tcl:
...
set configuration(eventFileDirectory) "~/stagearea/complete"
...
The application file:
source ~/expconfig.tcl
proc getEventFile {} {
global configuration
set eventFileDirectory $configuration(eventFileDirectory)
return [tk_getOpenFile -title "Select an Event File" \
-defaultextension .evt \
-filetypes {{{Event Files} .evt} \
{{All Files} *}} \
-initialdir $eventFileDirectory]
}
If you've thought about why we're doing all this, you may have caught on that in ending here I've tried to pull a fast one. There may be other software, not written in Tcl, that will want to get this configuration information. After all, system scope should mean everywhere in the system, not just the subset of the system that consists of Tcl scripts. We will take up building configurations that can be shared between implementation languages later in this chapter. First I want you to be aware of the typical ways each type of software might be configured.
Typical methods I have seen used to configure C/C++ software at the NSCL include:
Reading configuration files of an application specific format
Creating an interpreter, sourcing a Tcl configuration script and interrogating the interpreter.
Using getenv to retrieve configuration information from the environment.
Suppose we have a C/C++ program that has implemented a class named CADC, which manages a single set of ADC channels. For this experiment we want to be able to configure the number of these ADCs and their base addresses. Consider the following code fragment:
Example 5-10. Selecting number and base addresses of ADCs with code.
/* Edit the following to configure the software: */
const unsigned int nADCS=5; /* # of adcs */
const unsigned long ADCBases[nADCS] = { /* Base addresses */
0x10000, 0x20000, 0x30000, 0x40000, 0x50000};
...
for (int i=0; i < nADCS; i++) {
addAdc(new CADC(ADCBases[i]));
}
...
Each time we want to reconfigure the readout represented by this code fragment, we will need to edit some program file and recompile the program. A much nicer solution would be to create a data file. The data file will consist of a count of the number of ADCs, followed by the base address of each ADC, one per line. Leaving out error checking, and calling this file config.dat:
Example 5-11. Configuring C/C++ with application specific data files
#include <fstream>
...
std::ifstream file("/user/fox/configuration/config.dat");
file.unsetf(std::ios_base::basefield); /* let >> recognize the 0x prefix */
unsigned int nADCs;
file >> nADCs;
for (unsigned int i = 0; i < nADCs; i++) {
    unsigned long baseAddress;
    file >> baseAddress;
    addAdc(new CADC(baseAddress));
}
...
Contents of /user/fox/configuration/config.dat.
5
0x10000
0x20000
0x30000
0x40000
0x50000
This has the same effect. Note that since Readout and other NSCLDAQ applications may run in some strange current working directories, we provide an absolute path to the configuration file. Later we'll see how to use environment variables to configure this path.
While the application specific data file described in the previous section does work, there are drawbacks to this approach:
The file is rather hard to understand without looking at the source code, and it can be a programming chore to add enough syntax and grammar to the file to make it more readable.
This solution is not very good for larger experiments where it would be good to have some looping constructs in the configuration file.
All of the missing elements can be provided by using a Tcl script to configure the program. Tcl has a relatively easy to use application programming interface (API). In addition, both SpecTcl and NSCLDAQ provide a class library that makes using Tcl even easier. Using the best practices described in the section on configuring Tcl scripts, we'll read the configuration into an array called config. The element config(NADCS) will contain the count of ADCs, while config(n) will contain the base address of ADC number n. Furthermore, to illustrate the scalability of this approach, we will assume the base address of each ADC differs from that of the previous ADC by 0x10000. This makes the configuration file:
Example 5-12. A C/C++ configuration file written in Tcl
# Define the ADC count and base addresses.
# config(NADCS) is the number of adcs while
# config(i) is the base address of adc number i.
#
set config(NADCS) 5
set base 0x10000
set offset 0x10000
for {set i 1} {$i <= $config(NADCS)} {incr i} {
set config($i) $base
incr base $offset
}
Now that we have a script, we need to use it as a configuration (again omitting error processing):
Example 5-13. Using a Tcl configuration script in C/C++
...
CTCLInterpreter interp;
interp.EvalFile("/user/fox/configuration/config.tcl");
CTCLVariable configArray(&interp, "config");
char* strValue = configArray.Get(TCL_GLOBAL_ONLY, "NADCS");
long numAdcs = interp.ExprLong(strValue);
for (int i = 1; i <= numAdcs; i++) { /* config.tcl numbers the ADCs from 1 */
    char indexString[100];
    sprintf(indexString, "%d", i);
strValue = configArray.Get(TCL_GLOBAL_ONLY, indexString);
unsigned long base = interp.ExprLong(strValue);
addAdc(new CADC(base));
}
...
See the reference section for more information about the
CTCLInterpreter and CTCLVariable
classes and their APIs.
In both of the previous sections we have had to open a file. In each case, we specified the absolute path to the file (e.g. /user/fox/configuration/config.tcl). The location of this configuration file may itself need to be configurable. We could then keep several canned configuration files and switch back and forth between them for different runs of the program.
Suppose we arrange for an environment variable named READOUTCONFIGFILE to have the full path to the configuration file. In that case we might change the previous example slightly:
Example 5-14. Using environment variables to configure C/C++ programs.
...
CTCLInterpreter interp;
char* configFile = getenv("READOUTCONFIGFILE");
if (configFile) {
interp.EvalFile(configFile);
}
...
In this example, we used the getenv function to retrieve the value of an environment variable. getenv will return NULL if the environment variable is not defined, or a pointer to the value of the variable if it is.
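The launching side of this arrangement can be sketched in bash. This is only an illustration: the helper select_config, the CONFIGDIR variable, and the layout of one canned .tcl file per run type are assumptions made for the sketch, not anything NSCLDAQ provides.

```shell
#!/bin/bash
# Hypothetical layout: one canned configuration file per run type under CONFIGDIR.
CONFIGDIR="${CONFIGDIR:-$HOME/configuration}"

# select_config is an illustrative helper, not part of NSCLDAQ. It points
# READOUTCONFIGFILE at the canned file for the requested run type and exports
# it so that a program started afterwards can find it with getenv.
select_config() {
    local runtype="$1"
    READOUTCONFIGFILE="${CONFIGDIR}/${runtype}.tcl"
    export READOUTCONFIGFILE
}

select_config calibration
echo "$READOUTCONFIGFILE"
```

Switching to a different canned configuration is then a matter of calling select_config with another name before starting the readout program; the C++ code does not change.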
So far we have seen methods for configuring shell scripts, Tcl scripts and C/C++ code. These mechanisms are different. This poses problems. Suppose a shell script, a Tcl script and a C++ program all need to gain access to the value of a common configuration parameter. We don't want to define this parameter in three different ways. If we did that, we would certainly have cases where we forgot to change one or more of the definitions, leading to inconsistent, or even wrong, operation of our system.
What we therefore need is a common mechanism for defining configuration parameters that is available to software regardless of implementation language. Using this, we would only define a configuration parameter once. All software would refer to this single source of configuration information. Nonetheless, we want to use configuration idioms that are natural for each of the programming languages: bash, Tcl and C/C++.
When we consider that C++ programs can use Tcl scripts to configure themselves, the problem reduces to two cases: configuring shell scripts and configuring Tcl scripts. Shell scripts are easily configured via shell and environment variables. Consider the following file, call it ~/daqconfig:
Example 5-15. Configuration text file
ConfigDir=~/config
HardwareConfigFile=configfile.tcl
ReadoutConfigFile=readout.tcl
SpecTclConfigFile=spectcl.tcl
Example 5-16. Using ~/daqconfig to check the existence of the hardware config file
. ~/daqconfig
if [ -e "$ConfigDir/$HardwareConfigFile" ]; then
...
else
echo Missing hardware configuration file $ConfigDir/$HardwareConfigFile
exit -1
fi
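A script that sources ~/daqconfig may also want to confirm that the parameters it depends on were actually defined before it uses them. The following is a minimal sketch of that idea; require_config is a hypothetical helper, and the two assignments stand in for sourcing ~/daqconfig so that the sketch runs anywhere.

```shell
#!/bin/bash
# require_config is a hypothetical helper: it reports every variable named in
# its arguments that is unset or empty, and fails if any of them is missing.
require_config() {
    local name missing=0
    for name in "$@"; do
        # ${!name:-} is bash indirect expansion: the value of the variable
        # whose name is stored in 'name', or empty if it is not set.
        if [ -z "${!name:-}" ]; then
            echo "Missing configuration parameter: $name" >&2
            missing=1
        fi
    done
    return $missing
}

# Stand-in for '. ~/daqconfig'.
ConfigDir="$HOME/config"
HardwareConfigFile="configfile.tcl"

if require_config ConfigDir HardwareConfigFile; then
    echo "configuration complete"
fi
```

Checking every name before failing means a single run reports all of the missing parameters at once.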
If we can produce a script that can read this file into a Tcl Configuration array we will be done as C++ can use this scheme as well. Consider the following Tcl proc:
Example 5-17. Turning shell variable definition files to Tcl config arrays
proc readConfig {filename var} {
set fd [open $filename r]
set config [read $fd]
close $fd
set lines [split $config "\n"]
foreach line $lines {
set configinfo [split $line =]
set key [lindex $configinfo 0]
set value [lindex $configinfo 1]
if {$key ne ""} {
upvar #0 ${var}($key) target
set target $value
}
}
}
This proc defines the command readConfig. readConfig takes two parameters: the name of a configuration file in key=value form, and the name of a configuration array into which this configuration file will be merged. Here's how it works:

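The file is opened and its entire contents are read into the config local variable.
The contents are split at newlines into a list of lines.
Each line is split at the = character; the first element becomes the key and the second the value.
Lines with an empty key, such as blank lines, are skipped.
upvar #0 links the local name target to the element of the global array named by var whose index is the key, and the value is stored there.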
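For comparison, the same parsing steps can be sketched in bash. This is only an illustration of the algorithm: a shell script would normally just source the file as in Example 5-16. The function read_config and the array name configuration are hypothetical, associative arrays require bash 4 or later, and unlike the Tcl version this sketch keeps any further = characters as part of the value.

```shell
#!/bin/bash
# Requires bash 4 or later for associative arrays.
declare -A configuration

# read_config is a hypothetical bash counterpart of readConfig: it merges the
# key=value lines of a file into the 'configuration' associative array.
read_config() {
    local key value
    # IFS='=' splits each line at the first '='; the rest of the line is the value.
    # The '|| [ -n "$key" ]' clause handles a final line with no trailing newline.
    while IFS='=' read -r key value || [ -n "$key" ]; do
        # Skip lines with an empty key (blank lines), as the Tcl version does.
        if [ -n "$key" ]; then
            configuration["$key"]="$value"
        fi
    done < "$1"
}

# Build a small sample file so the sketch runs anywhere.
sample=$(mktemp)
printf 'ConfigDir=~/config\nReadoutConfigFile=readout.tcl\n' > "$sample"

read_config "$sample"
echo "${configuration[ReadoutConfigFile]}"
rm -f "$sample"
```

As in the Tcl proc, values are stored as plain text, so the ~ in ConfigDir is not expanded at the time the file is read.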


