Htk-group blog

From SpeechWiki

Revision as of 16:46, 21 June 2006 by Arthur
Question: train.pl takes too much space ======================

Hello, everyone,

It seems that with the current train.pl, which saves everything in text format instead of binary format, the working directory can easily grow to over 900MB and so runs into the limit of the Unix account (the quota is 1GB for me @nibbler). The /data folder is over 300MB and /mmf is over 500MB, and that is before clustering has finished.

The error is something like "ERROR [+5010] FClose: closing file failed". The HTK command that hits this error is terminated.

Has anyone else run into this problem when working on the Unix machines?

Thanks, Xiaodan

Answer: MH ======================

Ah! You should be able to get more working space by creating a data directory for yourself, e.g., mkdir /home/spot1/xzhuang.

Another possibility would be to output all of the intermediate MMFs in binary, and only keep the last one in each cycle as text. That could be done as follows (modify each of the HERest loops this way):

   $binary = ($i==8) ? "" : "-B";
   system("HERest -A -T 4 -I mlf/unaligned_phones.mlf $occ $binary -C cfg/HERest.cfg -t 250.0 150.0 10000.0 -S scp/train.scp -H $MMF -M mmf/flat1G${i} mmf/monophones.phf");
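
As a minimal sketch of the same idea, the flag choice can be pulled into a small helper so that every HERest loop uses it the same way. The name herest_format_flag and the $last parameter are hypothetical and not part of train.pl; the value 8 is simply the final iteration used in the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper (not in train.pl): choose the HERest output-format
# flag for iteration $i of a loop whose final iteration is $last.
# Intermediate iterations get "-B" (binary MMF); the last one gets no flag,
# so its MMF is written as text.
sub herest_format_flag {
    my ($i, $last) = @_;
    return $i == $last ? "" : "-B";
}

# Show which flag each iteration of an 8-step loop would receive.
for my $i (1 .. 8) {
    my $binary = herest_format_flag($i, 8);
    print "iteration $i: flag='$binary'\n";
}
```

Inside the real loop, $binary would be interpolated into the HERest command line exactly as in the snippet above.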

Another possibility would be to actually delete all of the intermediate MMFs. That could be done by adding this line after the 'system("HERest"' line in each of the HERest loops, but before the redefinition of $MMF. The "if ($i > 1)" condition ensures that the MMF with which you start the loop is spared.

 if ($i > 1) { system("rm $MMF"); }
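
A possible variant, sketched here under the assumption that $MMF holds the path of a single file, uses Perl's built-in unlink instead of shelling out to rm, and reports a failure instead of ignoring it. The helper name remove_intermediate_mmf is hypothetical and not part of train.pl.

```perl
use strict;
use warnings;

# Hypothetical helper (not in train.pl): delete the MMF that was the input
# to iteration $i, but spare the one the loop started from ($i == 1).
# Returns 1 if the file was removed, 0 otherwise.
sub remove_intermediate_mmf {
    my ($i, $mmf) = @_;
    return 0 unless $i > 1;      # keep the starting MMF
    return 0 unless -f $mmf;     # nothing to delete
    unlink($mmf) or warn "could not remove $mmf: $!\n";
    return -e $mmf ? 0 : 1;
}
```

As with the one-line version, the call would go after the HERest command and before $MMF is redefined, for example remove_intermediate_mmf($i, $MMF);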

- Mark

Subdirectory for train.pl ======================

For people who run train.pl, the following command might be necessary before the UPMIXING and TESTING parts. Otherwise, the script will fail to write the results to the results/all/*.results files, because the results/all subdirectory does not exist.

mkdir "results/all/" unless -d "results/all/"; #necessary for first run

Or, just manually create the subdirectory before running train.pl the first time.
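
Note that Perl's plain mkdir creates only the last path component, so it fails if the results directory itself is missing. A minimal sketch that also creates missing parent directories, using make_path from the core File::Path module, could look like this. The helper name ensure_dir is hypothetical and not part of train.pl.

```perl
use strict;
use warnings;
use File::Path qw(make_path);

# Hypothetical helper (not in train.pl): make sure $dir exists, creating
# any missing parent directories as well. Returns 1 if the directory
# exists afterwards, 0 otherwise.
sub ensure_dir {
    my ($dir) = @_;
    return 1 if -d $dir;         # already there, nothing to do
    make_path($dir);             # creates parents too, like "mkdir -p"
    return -d $dir ? 1 : 0;
}
```

In train.pl this would be called once before the UPMIXING and TESTING parts, for example ensure_dir("results/all") or die "cannot create results/all\n";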

-Xiaodan
