Command line interface (CLI) reference

This page contains the full reference documentation for each command in the CLI. See also Command line interface (CLI) user guide for guidelines on using the CLI.

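The ReadAlongs CLI has five key commands:

- readalongs align: align a text file and an audio file and create the output files
- readalongs make-xml: make an XML file for 'readalongs align' from a plain text file
- readalongs tokenize: tokenize an XML file produced by 'readalongs make-xml'
- readalongs g2p: apply g2p mappings to a tokenized XML file
- readalongs langs: list the language codes and names supported by g2p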

Each command can be run with -h or --help to display its usage manual, e.g., readalongs -h, readalongs align --help.

readalongs align

Align TEXTFILE and AUDIOFILE and create output files as OUTPUT_BASE.* in directory OUTPUT_BASE/.

TEXTFILE: Input text file path (in XML or plain text)

If TEXTFILE has a .xml or .readalong extension or starts with an XML declaration line, it is parsed as XML and can be in one of three formats:

- the output of 'readalongs make-xml',
- the output of 'readalongs tokenize', or
- the output of 'readalongs g2p'.

If TEXTFILE has a .txt extension or does not start with an XML declaration line, it is read as plain text with the following conventions:

- The text should be plain UTF-8 text without any markup.
- Paragraph breaks are indicated by inserting one blank line.
- Page breaks are indicated by inserting two blank lines.
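The plain-text conventions above can be illustrated with a short Python sketch. This is not the parser ReadAlongs uses; the function name `split_pages` and the choice to treat runs of more than two blank lines as a page break are assumptions made for illustration only.

```python
import re


def split_pages(text: str) -> list[list[list[str]]]:
    """Split plain text into pages -> paragraphs -> lines.

    Assumed reading of the conventions: one blank line separates
    paragraphs, two (or more) blank lines separate pages.
    """
    # Normalise line endings and turn whitespace-only lines into empty lines.
    lines = [line.strip() for line in text.replace("\r\n", "\n").split("\n")]
    normalised = "\n".join(lines).strip("\n")

    pages = []
    # Two or more blank lines show up as three or more consecutive newlines.
    for page_text in re.split(r"\n{3,}", normalised):
        # Within a page, one blank line (two newlines) separates paragraphs.
        paragraphs = [p.split("\n") for p in re.split(r"\n{2}", page_text) if p]
        if paragraphs:
            pages.append(paragraphs)
    return pages
```

Calling `split_pages` on a text with one single-blank-line gap and one double-blank-line gap yields two pages, the first holding two paragraphs.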

One can add the known ARPABET phonetics in the XML for words (<w> elements) that are not correctly handled by g2p in the output of 'readalongs tokenize' or 'readalongs g2p', via the ARPABET attribute.

One can add anchor elements in the XML, e.g., '<anchor time="2.345s"/>', to mark known anchor points between the audio and text stream.

AUDIOFILE: Input audio file path, in any format supported by ffmpeg

OUTPUT_BASE: Output files will be saved as OUTPUT_BASE/OUTPUT_BASE.*

Usage:

readalongs align [OPTIONS] TEXTFILE AUDIOFILE OUTPUT_BASE

Options:

  -b, --bare                      Bare alignments do not split silences
                                  between words
  -c, --config PATH               Use ReadAlong-Studio configuration file (in
                                  JSON format)
  -o, --output-formats TEXT       Comma- or colon-separated list of additional
                                  output file formats to export to. The text
                                  is always exported as XML and alignments as
                                  SMIL, but one or more of these formats can
                                  be requested in addition:

                                  eaf (ELAN file), html (Single-file, offline
                                  HTML), srt (SRT subtitle), TextGrid (Praat
                                  TextGrid), vtt (WebVTT subtitle), xhtml
                                  (Simple XHTML)
  -f, --force-overwrite           Force overwrite output files
  -l, --language, --languages TEXT
                                  The language code(s) for text in TEXTFILE
                                  (use only with plain text input); multiple
                                  codes can be joined by ',' or ':', or by
                                  repeating the option, to enable the g2p
                                  cascade (run 'readalongs g2p -h' for
                                  details); run 'readalongs langs' to list all
                                  supported languages.
  -m, --align-mode [strict|moderate|loose|auto]
                                  Decoder search parameters: 'strict' means a
                                  narrow beam, fastest but might fail to find
                                  an alignment; 'loose' means an unlimited
                                  beam, slowest, should always succeed but the
                                  alignment is more likely to be wrong;
                                  'moderate' is in between; 'auto' (the
                                  default) means try strict first, and fall
                                  back to moderate then loose if no alignment
                                  is found.
  -s, --save-temps                Save intermediate stages of processing and
                                  temporary files (dictionary, FSG,
                                  tokenization, etc)
  -d, --debug                     Display debugging messages
  --debug-aligner                 Display logs from the aligner
  --debug-g2p                     Display verbose g2p error messages
  --help                          Show this message and exit.
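The 'auto' behaviour of --align-mode described above (try strict, then moderate, then loose) can be sketched as a simple fallback loop. This is a minimal illustration, not the ReadAlongs implementation: `align_auto` and the `try_align` callback are hypothetical names, and the callback is assumed to return None when no alignment is found.

```python
from typing import Callable, Optional

# Order in which 'auto' is documented to try the search modes.
MODES = ("strict", "moderate", "loose")


def align_auto(try_align: Callable[[str], Optional[dict]]) -> tuple[str, dict]:
    """Try each mode in turn and return the first one that succeeds.

    try_align is a hypothetical stand-in for the aligner: it receives a
    mode name and returns an alignment, or None if none was found.
    """
    for mode in MODES:
        result = try_align(mode)
        if result is not None:
            return mode, result
    raise RuntimeError("no alignment found in any mode")
```

The design point is that the narrow beam is tried first because it is the fastest, and wider beams are only paid for when needed.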

readalongs make-xml

Make XMLFILE for 'readalongs align' from PLAINTEXTFILE.

PLAINTEXTFILE must be plain text encoded in UTF-8, with one sentence per line, paragraph breaks marked by a blank line, and page breaks marked by two blank lines.

PLAINTEXTFILE: Path to the plain text input file, or - for stdin

XMLFILE: Path to the XML output file, or - for stdout [default: PLAINTEXTFILE.readalong]

Usage:

readalongs make-xml [OPTIONS] PLAINTEXTFILE [XMLFILE]

Options:

  -d, --debug                     Add debugging messages to logger
  -f, --force-overwrite           Force overwrite output files
  -l, --language, --languages TEXT
                                  The language code(s) for text in
                                  PLAINTEXTFILE; multiple codes can be joined
                                  by ',' or ':', or by repeating the option,
                                  to enable the g2p cascade (run 'readalongs
                                  g2p -h' for details); run 'readalongs langs'
                                  to list all supported languages.  [required]
  --help                          Show this message and exit.
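The -l option accepts language codes joined by ',' or ':', or given by repeating the option. The sketch below shows one way such values could be flattened into a single ordered list. It only illustrates the documented syntax; `parse_language_codes` is a hypothetical helper, not a function from ReadAlongs.

```python
import re


def parse_language_codes(values: list[str]) -> list[str]:
    """Flatten repeated -l values into one ordered list of codes.

    Each value may itself hold several codes joined by ',' or ':'.
    """
    codes = []
    for value in values:
        for code in re.split(r"[,:]", value):
            code = code.strip()
            if code:  # ignore empty pieces such as a trailing separator
                codes.append(code)
    return codes
```

Under this reading, `-l fra:eng`, `-l fra,eng` and `-l fra -l eng` all describe the same language list, with the first code used first and the rest as fallbacks for the g2p cascade.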

readalongs tokenize

Tokenize XMLFILE for 'readalongs align' into TOKFILE.

XMLFILE should have been produced by 'readalongs make-xml'. TOKFILE can then be augmented with word-specific language codes. 'readalongs align' can be called with either XMLFILE or TOKFILE as XML input.

XMLFILE: Path to the XML file to tokenize, or - for stdin

TOKFILE: Output path for the tok'd XML, or - for stdout [default: XMLFILE.tokenized.readalong]

Usage:

readalongs tokenize [OPTIONS] XMLFILE [TOKFILE]

Options:

  -d, --debug            Add debugging messages to logger
  -f, --force-overwrite  Force overwrite output files
  --help                 Show this message and exit.

readalongs g2p

Apply g2p mappings to TOKFILE into G2PFILE.

TOKFILE should have been produced by 'readalongs tokenize'. G2PFILE can then be modified to adjust the phonetic representation as needed. 'readalongs align' can be called with G2PFILE instead of TOKFILE as XML input.

The g2p cascade will be enabled whenever an XML element or any of its ancestors in TOKFILE has the attribute "fallback-langs" containing a comma- or colon-separated list of language codes. Provide multiple language codes to "readalongs make-xml" via its -l option to generate this attribute globally, or add it manually where needed. Undetermined, "und", is automatically added at the end of the language list provided via -l.

With the g2p cascade, if a word cannot be mapped to valid ARPABET with the language found in the "xml:lang" attribute, the languages in "fallback-langs" are tried in order until a valid ARPABET mapping is generated.
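The cascade logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the g2p library's API: `cascade_g2p` is a hypothetical function, each converter is assumed to return None when it cannot produce valid ARPABET, and the toy lookup tables are hand-written for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical converter type: returns ARPABET, or None if mapping fails.
Converter = Callable[[str], Optional[str]]


def cascade_g2p(
    word: str,
    lang: str,
    fallback_langs: List[str],
    converters: Dict[str, Converter],
) -> Optional[Tuple[str, str]]:
    """Try the xml:lang language first, then each fallback in order."""
    for code in [lang, *fallback_langs]:
        convert = converters.get(code)
        if convert is None:
            continue  # no converter registered for this code in the sketch
        arpabet = convert(word)
        if arpabet is not None:
            return code, arpabet
    return None  # no language produced a valid mapping


# Toy converters backed by tiny hand-written tables (illustrative only).
TOY_CONVERTERS: Dict[str, Converter] = {
    "fra": {"bonjour": "B OW N ZH UW R"}.get,
    "eng": {"hello": "HH AH L OW"}.get,
}
```

With these toy tables, `cascade_g2p("hello", "fra", ["eng"], TOY_CONVERTERS)` falls through the "fra" converter and is resolved by "eng".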

The output XML file can be used as input to align.

TOKFILE: Path to the input tokenized XML file, or - for stdin

G2PFILE: Output path for the g2p'd XML, or - for stdout [default: TOKFILE with .g2p. inserted]

Usage:

readalongs g2p [OPTIONS] TOKFILE [G2PFILE]

Options:

  -f, --force-overwrite  Force overwrite output files
  --debug-g2p            Display verbose messages about g2p errors.
  -d, --debug            Add debugging messages to logger
  --help                 Show this message and exit.
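Taken together, make-xml, tokenize, g2p and align can be chained, with each step's output passed to the next. The sketch below only assembles the four command lines from the usage strings on this page; the helper name `pipeline_commands` and the explicit intermediate file names are assumptions chosen for the example, and nothing is executed.

```python
from pathlib import Path
from typing import List


def pipeline_commands(
    text_file: str, audio_file: str, output_base: str, languages: List[str]
) -> List[List[str]]:
    """Build the make-xml -> tokenize -> g2p -> align command lines.

    Intermediate file names are passed explicitly (an assumption of this
    sketch) rather than relying on each command's default output name.
    """
    stem = Path(text_file).with_suffix("")  # e.g. story.txt -> story
    xml_file = f"{stem}.readalong"
    tok_file = f"{stem}.tokenized.readalong"
    g2p_file = f"{stem}.g2p.readalong"
    return [
        # -l is required by make-xml; codes are joined by ':'.
        ["readalongs", "make-xml", "-l", ":".join(languages), text_file, xml_file],
        ["readalongs", "tokenize", xml_file, tok_file],
        ["readalongs", "g2p", tok_file, g2p_file],
        # No -l here: that option is only for plain text input to align.
        ["readalongs", "align", g2p_file, audio_file, output_base],
    ]
```

Each inner list can be handed to `subprocess.run(cmd, check=True)` on a machine where readalongs is installed; the intermediate files can be edited by hand between steps, as described above.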

readalongs langs

List all the language codes and names currently supported by g2p that can be used for ReadAlongs creation.

Usage:

readalongs langs [OPTIONS]

Options:

  --help  Show this message and exit.