Julian is speech recognition software based on a deterministic finite state automaton (DFA) grammar. Unlike Julius, which uses a statistical word N-gram as its language model, Julian uses a hand-written (or auto-generated) task grammar as its linguistic constraint. A task grammar is a set of rules and patterns describing the acceptable words and word sequences. Because the allowed hypotheses are strictly defined by the grammar, Julian is efficient for small-vocabulary recognition systems (e.g. voice commands, isolated word recognition, spoken dialogue systems for small tasks).
Julian is derived from Julius to drive the grammar constraint directly, and most of the code is shared with Julius. In fact, Julian can be compiled from the Julius source code simply by specifying "--enable-julian" to configure. Since the major speech recognition techniques in Julius are incorporated into Julian, it also performs very well. For example, it can recognize a vocabulary of over a thousand words in real time on a machine slower than a 300 MHz Pentium III.
Julian was originally a product of the Continuous Speech Recognition Consortium (CSRC), Japan, and was distributed only to CSRC members. Now that the consortium has successfully completed its three years of activity, Julian is freely available as of rev.3.4.
The Julius-3.4 archive also includes Julian and several grammar construction tools. Please see the documents below for their usage.
ABOUT LICENSE: the original license terms are in Japanese; they are summarized below for convenience and quick understanding. Please consult the original Japanese LICENSE file in the source archive for precise details.
Generally, the license of Julius is similar to the BSD license. There is NO obligation to make your source code free as under the GPL, and NO restriction on usage, even for commercial purposes. Redistribution and modification of all or part of Julius is also permitted, provided that you attach the copyright notice below to your package, along with the original Japanese license document (LICENSE.txt):
Copyright (c) 1991-2004 Kyoto University, Japan
Copyright (c) 1997-2000 Information Promotion Agency, Japan
Copyright (c) 2000-2004 Nara Institute of Science and Technology, Japan
In module mode, the confidence score is given by the "CM" attribute of the WHYPO tag.
To output the confidence score in module mode, add the "C" argument to the "-outcode" option. The example below tells Julius / Julian to output recognized words ("W"), their LM entries ("L"), phoneme sequences ("P"), scores ("S"), and confidence scores ("C") to a client in module mode:
% julius .... -outcode WLPSC -module
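On the client side, the confidence score can then be read from the "CM" attribute of each WHYPO tag. The Python sketch below parses a hypothetical module-mode message. Only the WHYPO tag and its CM attribute come from the description above; the sample message, the other tag and attribute names, and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical module-mode message. Only the WHYPO tag and its CM attribute
# are taken from the text; the other tag and attribute names are assumed.
SAMPLE = """<RECOGOUT>
  <SHYPO RANK="1">
    <WHYPO WORD="turn" CM="0.912"/>
    <WHYPO WORD="left" CM="0.431"/>
  </SHYPO>
</RECOGOUT>"""


def word_confidences(message):
    """Return (word, confidence) pairs for every WHYPO element that has CM."""
    root = ET.fromstring(message)
    pairs = []
    for whypo in root.iter("WHYPO"):
        cm = whypo.get("CM")
        if cm is not None:  # skip words that carry no confidence score
            pairs.append((whypo.get("WORD"), float(cm)))
    return pairs


if __name__ == "__main__":
    for word, cm in word_confidences(SAMPLE):
        print(word, cm)
```

A client could compare each score against a threshold of its own choosing to reject unreliable words; a suitable threshold depends on the task.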
The smoothing coefficient can be set with "-cmalpha". This coefficient is used to smooth and compensate for the dynamic range of hypothesis likelihoods when computing word confidence. The default value is 0.05; a smaller value (closer to zero) causes the overall distribution of confidence scores to level toward the middle (0.5). The quality of the confidence scores may vary with this value, and tuning it for the target data set may improve scoring accuracy. (However, the default value should work well in most cases.)
To disable confidence measuring, specify "--disable-cm" to configure.
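The effect of the "-cmalpha" coefficient can be illustrated with a small Python sketch that scales the log likelihoods of competing hypotheses by alpha and then normalizes them. This is an assumption about the general form of a posterior-style confidence computation, not the actual Julius implementation; the scores and the function name are made up.

```python
import math


def confidences(log_scores, alpha):
    """Scale log likelihoods by alpha, then normalize them to sum to 1."""
    top = max(log_scores)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(alpha * (s - top)) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    scores = [-100.0, -120.0]  # hypothetical log likelihoods of two rivals
    for alpha in (0.05, 0.005, 0.0005):
        best, _ = confidences(scores, alpha)
        print(f"alpha={alpha}: confidence of best hypothesis = {best:.3f}")
```

With two competing hypotheses, the confidence of the better one moves toward 0.5 as alpha approaches zero, which matches the leveling behavior described for small values.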
The intra-class word probability, i.e. the probability of a word appearing within the class it belongs to, should be written as an additional field in the word dictionary. The normal word dictionary is written in the following style:
WordName [OutputString] phone1 phone2 ...
When using a class N-gram, insert the name of the class the word belongs to and the word's intra-class probability at the beginning of the entry. The probability should be written in log10, preceded by the indicator "@".
ClassName @IntraClassLogProb WordName [OutputString] phone1 phone2 ...
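As a rough illustration of this format, the Python sketch below splits one class N-gram dictionary line into its fields. The sample entries, the function name, and the choice to treat the bracketed output string as optional and free of "]" characters are assumptions, not part of the Julius tools.

```python
import re

# ClassName @IntraClassLogProb WordName [OutputString] phone1 phone2 ...
ENTRY = re.compile(
    r"^(?P<cls>\S+)\s+@(?P<logprob>\S+)\s+(?P<word>\S+)"
    r"(?:\s+\[(?P<output>[^\]]*)\])?\s+(?P<phones>.+)$"
)


def parse_class_entry(line):
    """Split one class N-gram dictionary line into its fields."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"not a class N-gram dictionary entry: {line!r}")
    log10_prob = float(match.group("logprob"))
    return {
        "class": match.group("cls"),
        "log10_prob": log10_prob,
        "prob": 10.0 ** log10_prob,  # convert log10 back to a probability
        "word": match.group("word"),
        "output": match.group("output"),  # None when [OutputString] is omitted
        "phones": match.group("phones").split(),
    }


if __name__ == "__main__":
    # Hypothetical entry: class CITY, intra-class probability of about 0.5.
    print(parse_class_entry("CITY @-0.30103 kyoto [Kyoto] k y o t o"))
```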
The table below shows the correspondence between word N-gram and class N-gram.
| | Word N-gram | Class N-gram |
|---|---|---|
| N-gram file(s) | Word N-gram | Inter-class N-gram |
| Dictionary | Word entry | Word entry + Intra-class probability |
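In the usual class N-gram formulation, the probability of a word given its history is the inter-class N-gram probability of its class multiplied by the intra-class probability of the word, which becomes a sum in the log10 domain. The Python sketch below shows that combination with made-up values; the function name and numbers are hypothetical and are not taken from the Julius source.

```python
def word_log10_prob(inter_class_log10, intra_class_log10):
    """log10 P(word | history) = log10 P(class | history) + log10 P(word | class)."""
    return inter_class_log10 + intra_class_log10


if __name__ == "__main__":
    inter = -0.5  # hypothetical log10 P(class | preceding classes), from the N-gram file
    intra = -1.0  # hypothetical log10 P(word | class), from the "@" dictionary field
    total = word_log10_prob(inter, intra)
    print(total, 10.0 ** total)
```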
To disable class N-gram support, specify "--disable-class-ngram" to configure.
The recorded file format of Julius (-record), adintool, and adinrec has been changed to Microsoft WAVE format (.wav). You can still record in RAW format by specifying "-raw" on adintool and adinrec.
The default level threshold (-lv) of Julius, adinrec, and adintool was changed from 3000 to 2000.
(-tailmargin) has been fixed.
"-gprune none".
<RECOGFAIL> to <RECOGFAIL/>.
</RECOGOUT> when the 2nd search terminates with hypothesis overflow.
"-setting" will output the configuration options and exit.
"-version" to "-setting"
"-hipass" to "-hifreq", "-lopass" to "-lofreq". The old options are still acceptable.
configure options:
--enable-julian
--disable-cm
--disable-class-ngram
-cmalpha value
-outcode C
-lv 2000
-setting
-hifreq
-lofreq