Scripts Documentation

From SpeechWiki

Latest revision as of 19:05, 19 May 2010

Where to find documentation

Via the web

High-level descriptions of (some of) the scripts are on the wiki page High Level Scripts Documentation.

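The auto-generated documentation is available for:

  * perl scripts (http://mickey.ifp.uiuc.edu/speech/akantor/fisher/scripts/doc/perl/html/files.html) and modules (http://mickey.ifp.uiuc.edu/speech/akantor/fisher/scripts/doc/perl/html/annotated.html)
  * python scripts and modules (http://mickey.ifp.uiuc.edu/speech/akantor/fisher/scripts/doc/python/)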

Via the IFP network

The auto-generated documentation is in: /cworkspace/ifp-32-1/hasegawa/programs/scripts/doc

The location of scripts themselves

The scripts themselves are all in svn://mickey/scripts.

A working copy of the SVN repository is checked out into /cworkspace/ifp-32-1/hasegawa/programs/scripts on the ifp-32 cluster. It is intended to always be in a usable state, and our SST group should use the scripts from there. The readme.html file there explains the structure.

A web view onto the official version is temporarily at http://mickey.ifp.uiuc.edu/speech/akantor/fisher/scripts/

Instructions for contributors

How to auto-document the scripts

  1. Make sure your PATH is set up so that perl and python can run the scripts.
  2. Run:
     cd /cworkspace/ifp-32-1/hasegawa/programs/scripts/doc
     ./makeDoc.sh

You need doxygen, doxygenfilter, and epydoc installed, and doxygen needs the patch in /cworkspace/ifp-32-1/hasegawa/programs/scripts/doc/makeDoc.conf/PerlFilter.pm
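
Before running makeDoc.sh, it can help to confirm that the prerequisites are reachable. The following is a minimal sketch, not part of the scripts repository; the executable names in REQUIRED_TOOLS are assumptions and may differ on your installation.

```python
import shutil

# Assumed executable names (hypothetical); adjust them to the local installation.
REQUIRED_TOOLS = ["perl", "python", "doxygen", "doxygenfilter", "epydoc"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found; makeDoc.sh can be run.")
```

The check only looks at the PATH; it does not verify that the PerlFilter.pm patch has been applied.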

How to comment your scripts for auto-documentation

In both perl and python you can write comments using the javadoc conventions (more precisely, epydoc for python and doxygen for perl, although both should be supersets of javadoc).

Both perl and python documentation additionally include the usage text that should display when one runs the script with the --help option.

python

Python documentation is handled by epydoc. The command-line --help text and the web documentation are generated from the same place, which puts a few minimal requirements on how scripts are written; the existing scripts can serve as examples. First, the file docstring must contain, somewhere, the string %InsertOptionParserUsage%, which will be replaced by the usage documentation:

"""
%InsertOptionParserUsage%

The rest of the file documentation...
@author ...
@see ...
"""

Second, you need to augment the __doc__ string with the usage info whenever the file is interpreted by python. Putting the following at the end of the file works:

# The parser is used for generating documentation, so always create it and augment __doc__ with usage info.
# This messes up epydoc a little, but allows us to keep a single version of documentation for all purposes.
parser = makeParser()
__doc__ = __doc__.replace("%InsertOptionParserUsage%\n", parser.format_help())


if __name__ == "__main__":
    main(sys.argv)

If you don't do the above, the documentation will be generated without the --help usage.
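
The placeholder substitution can be tried in isolation. This is a minimal sketch that assumes the scripts build their parser with optparse's OptionParser; the script name countLines.py, its option, the DOC_TEMPLATE variable, and the helper insertUsage are hypothetical. In a real script the template would be the file docstring and the result would be assigned to __doc__.

```python
from optparse import OptionParser

# Stand-in for the file docstring of a hypothetical script, countLines.py.
DOC_TEMPLATE = """
%InsertOptionParserUsage%

Counts the lines of a text file.
@author A. N. Other
"""


def makeParser():
    """Create the option parser that describes the command-line options."""
    parser = OptionParser(prog="countLines.py", usage="%prog [options] FILE")
    parser.add_option("-v", "--verbose", action="store_true", default=False,
                      help="print progress messages")
    return parser


def insertUsage(doc, parser):
    """Replace the placeholder line in doc with the parser's help text."""
    return doc.replace("%InsertOptionParserUsage%\n", parser.format_help())


doc = insertUsage(DOC_TEMPLATE, makeParser())
print(doc)
```

Because the same parser object produces both the --help output and the text placed in the docstring, the two cannot drift apart.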

perl

Perl documentation is generated with doxygen and a doxygenfilter script modified to generate --help usage.

Comments used for doc generation should have the first line start with a ##, like this:

## @file 
# Based on genPhonePhonePos2WholePhoneStateDTs.pl from the gmtk Aurora tutorial

If a comment with an @file directive is present (as above), the documentation is associated with the file. In this case, doxygenfilter simply runs the file with the --help option (e.g. file.pl --help) and includes the output in the documentation.

Admittedly, executing every script while generating documentation is a little dangerous, but I (Arthur) tried to do this quickly, so anyone is welcome to improve this.
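
The idea of capturing a script's --help output can be sketched as follows. This is an illustration in Python, not the code of the modified doxygenfilter (which is written in perl); the function name capture_help and the timeout value are assumptions. A timeout is one simple way to limit the damage if a script misbehaves when it is run.

```python
import subprocess
import sys


def capture_help(command, timeout=10):
    """Run `command` with --help appended and return its combined output."""
    result = subprocess.run(
        list(command) + ["--help"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # some scripts print usage to stderr
        universal_newlines=True,    # decode the output to text
        timeout=timeout,            # do not hang on a misbehaving script
    )
    return result.stdout


# Usage: a stand-in "script" that just echoes the arguments it receives.
print(capture_help([sys.executable, "-c", "import sys; print(sys.argv[1:])"]))
```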

Also note that perl does not really have named arguments, so doxygenfilter tries to parse the code for the common idiom of assigning the argument list to variables (e.g. my ($arg1, $arg2) = @_;) and generates the argument names from there. You can of course specify the arguments in the documentation too:

## @fn private void debug(@args)
# A simple function for debugging. Prints the arguments to STDERR.
# @param args The stuff to be printed.
sub debug {
    my(@args) = @_;
    print STDERR @args;
}
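
How argument names might be recovered from the my (...) = @_; idiom can be illustrated with a regular expression. This is a sketch in Python under the assumption that a single-line pattern match is enough; it is not doxygenfilter's actual parser, and the names ARG_LIST and perl_arg_names are hypothetical.

```python
import re

# Matches the common perl idiom:  my ($arg1, $arg2) = @_;
ARG_LIST = re.compile(r"my\s*\(([^)]*)\)\s*=\s*@_\s*;")


def perl_arg_names(sub_body):
    """Return the variable names assigned from @_ in sub_body, or []."""
    match = ARG_LIST.search(sub_body)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


print(perl_arg_names("my ($arg1, $arg2) = @_;"))
print(perl_arg_names("my(@args) = @_;"))
```

Subs that read their arguments in other ways (for example with shift) are not matched by this pattern, which is one reason to state the arguments explicitly with @fn and @param.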