SoX(7)                         Sound eXchange_ng                        SoX(7)

NAME
       SoX - Sound eXchange_ng, another Swiss Army knife of audio manipulation

DESCRIPTION
       This manual describes the file formats and audio device types supported
       by SoX; the SoX manual set starts with sox_ng(1).

       Format  types that SoX can determine by a filename extension are listed
       with their names preceded by a dot.  Format types that  are  optionally
       built into SoX are marked `(optional)'.

       Format  types  that  are  handled  by  the external library sndfile are
       marked `(with sndfile)' and format types that can only  be  read  using
       the external program ffmpeg are marked `(with ffmpeg)'.

       Formats for which SoX has internal drivers but that are also  supported
       by  sndfile  or  ffmpeg  are marked `(also with -t sndfile)' or `(also
       with -t ffmpeg)'.  This might be useful if you have a file that doesn't
       work with SoX's built-in readers and writers.

       To  see  if  SoX  has  support  for an optional format or device, enter
       sox_ng -h and look for its name under `AUDIO FILE  FORMATS'  or  `AUDIO
       DEVICE DRIVERS'.

   FORMATS & DEVICE DRIVERS
       .raw (also with -t sndfile), .f32, .f64, .s8, .s16, .s24, .s32, .u8,
       .u16, .u24, .u32, .ul, .al, .lu, .la
              Raw  (headerless) audio files.  For raw, the sample rate and the
              data encoding must be given using command-line  format  options;
              for the other listed types, the sample rate defaults to 8kHz and
              the  data encoding is defined by the given suffix.  Thus f32 and
              f64 indicate files encoded as 32 and 64-bit IEEE-754 single  and
              double  precision  floating point PCM respectively; s8, s16, s24
              and s32 indicate 8, 16, 24 and 32-bit signed integer PCM respec-
              tively; u8, u16, u24 and u32 indicate 8, 16, 24 and  32-bit  un-
              signed   integer   PCM  respectively;  ul  indicates  `<mu>-law'
              (8-bit), al indicates `A-law' (8-bit) and lu and la are  inverse
              bit-order `<mu>-law' and `A-law' respectively.  For all raw for-
              mats, the number of channels defaults to 1.

              Headerless  audio  files on a SPARC computer are likely to be of
              format ul;  on a Mac, they're likely to be u8 but with a  sample
              rate of 11025 or 22050Hz.

              See  .ima  and  .vox  for raw ADPCM formats and .cdda for raw CD
              digital audio.
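
              As a rough illustration of the defaults described above, the fol-
              lowing Python sketch estimates the duration of a headerless file
              from its size.  The table and the function raw_duration are  hy-
              pothetical  helpers  written for this example and are not part of
              SoX; only the suffixes and the defaults (8kHz, 1  channel)  come
              from the text above.

```python
# Bits per sample implied by each raw-format suffix listed above.
BITS_PER_SAMPLE = {
    "f32": 32, "f64": 64,
    "s8": 8, "s16": 16, "s24": 24, "s32": 32,
    "u8": 8, "u16": 16, "u24": 24, "u32": 32,
    "ul": 8, "al": 8, "lu": 8, "la": 8,
}


def raw_duration(size_bytes, suffix, rate=8000, channels=1):
    """Return the duration in seconds of a headerless file.

    Hypothetical helper: rate and channels default to the values the
    suffix-named raw formats assume (8kHz, mono).
    """
    bytes_per_frame = BITS_PER_SAMPLE[suffix] // 8 * channels
    return size_bytes / bytes_per_frame / rate
```

              For example, raw_duration(16000, "s16") treats the  file  as  mono
              16-bit audio at 8kHz.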

       .f4, .f8, .s1, .s2, .s3, .s4, .u1, .u2, .u3, .u4, .sb, .sw, .sl, .ub,
       .uw
              Deprecated aliases for .f32, .f64, .s8, .s16, .s24,  .s32,  .u8,
              .u16, .u24, .u32, .s8, .s16, .s32, .u8 and .u16 respectively.

       .3gp, .3gpp (with ffmpeg)
              Third Generation Partnership Project format.

       .3g2, .3gp2, .3gpp2 (with ffmpeg)
              Third Generation Partnership Project 2 format.

       .8svx (also with -t sndfile)
              Amiga 8SVX musical instrument description format.

       .aac (with ffmpeg)
              Advanced Audio Coding format.

       .ac3 (with ffmpeg)
              Audio Codec 3 (Dolby Digital) format.

       .adts (with ffmpeg)
              Audio Data Transport Stream format.

       .aiff, .aif (also with -t sndfile or -t ffmpeg)
              AIFF  files  as  used on old Apple Macs, Apple IIc/IIgs and SGI.
              SoX's AIFF support does not include multiple  audio  chunks  nor
              the  8SVX musical instrument description format.  AIFF files are
              multimedia archives and can  have  multiple  audio  and  picture
              chunks;  you may need a separate archiver to work with them.  On
              MacOS X, AIFF has been superseded by CAF.

       .aiffc, .aifc (also with -t sndfile)
              AIFF-C is based on AIFF but also handles compressed  audio.   It
              can  also  handle little-endian uncompressed linear data that is
              often referred to as sowt encoding.  This  encoding  has  become
              the de facto format produced by modern Macs as well as iTunes on
              any platform.  AIFF-C files produced by other applications
              typically have the file extension .aif, so the header must be
              examined to detect the true format.  sowt, A-law and <mu>-law
              are the only encodings that SoX can read and write natively; for
              other compression types like GSM, try -t ffmpeg.

              AIFF-C is defined in DAVIC 1.4 Part 9 Annex B.  This  format  is
              referenced by ARIB STD-B24, which is specified for Japanese data
              broadcasting.  Private chunks are not supported.

       alsa (optional)
              The  Advanced  Linux  Sound  Architecture device driver supports
              both playing and recording audio.  ALSA is only used  in  Linux-
              based operating systems, though these often support OSS (see be-
              low) as well.  Examples:

                   sox_ng infile -t alsa
                   sox_ng infile -t alsa default
                   sox_ng infile -t alsa plughw:0,0
                   sox_ng -b 16 -t alsa hw:1 outfile


       .amb   Ambisonic  B-Format  is  a specialization of .wav with between 3
              and 16 channels of audio for use with an Ambisonic decoder.  See
              http://www.ambisonia.com/Members/mleese/file-format-for-b-format
              for details.  It is up to you to get the  channels  together  in
              the right order and at the correct amplitude.

       .amr-nb, .amr-wb (both optional, also with -t ffmpeg)
              Adaptive  Multi  Rate Narrow and Wide Band are lossy formats for
              speech used in 3rd generation mobile telephony  and  defined  in
              3GPP TS 26.071 and TS 26.171.

              AMR-NB  audio  has  a  fixed sampling rate of 8kHz and AMR-WB of
              16kHz and they support encoding to the following bit rates,  se-
              lected by the -C option:
                             amr-nb                     amr-wb
                           -C     kbit/s              -C     kbit/s
                           0       4.75               0       6.6
                           1       5.15               1       8.85
                           2       5.9                2      12.65
                           3       6.7                3      14.25
                           4       7.4                4      15.85
                           5       7.95               5      18.25
                           6      10.2                6      19.85
                           7      12.2                7      23.05
                                                      8      23.85
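
              The table can also be read in reverse, to find the -C value that
              fits a bit-rate budget.  The following Python sketch does  this;
              the function amr_mode is a hypothetical helper for illustration,
              not something SoX provides, and the lists simply repeat the  ta-
              ble above.

```python
# Bit rates in kbit/s, indexed by the -C value, copied from the table above.
AMR_NB_KBPS = [4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2]
AMR_WB_KBPS = [6.6, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]


def amr_mode(table, max_kbps):
    """Return the highest -C value whose bit rate does not exceed max_kbps."""
    candidates = [mode for mode, kbps in enumerate(table) if kbps <= max_kbps]
    if not candidates:
        raise ValueError("no AMR mode fits within %s kbit/s" % max_kbps)
    return candidates[-1]
```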

       ao (optional)
              Xiph.org's Audio Output device driver only works for playing au-
              dio.  It supports a wide range of devices and sound systems; see
              its  documentation for the full range.  For the most part, SoX's
              use of libao cannot be configured directly; instead, libao  con-
              figuration files must be used.

              The  filename is used to determine which libao plugin to use and
              normally, you should specify `default'.  If  that  doesn't  give
              the desired behavior, you can specify the short name for a given
              plugin (such as pulse for the PulseAudio plugin).  Examples:

                   sox_ng infile -t ao
                   sox_ng infile -t ao default
                   sox_ng infile -t ao pulse


       .ape (with ffmpeg)
              Monkey's Audio format.

       .apm (with ffmpeg)
              Ubisoft Rayman 2 APM format.

       .aptx (with ffmpeg)
              Audio Processing Technology for Bluetooth format.

              SoX  can only autodetect this type of file from its filename ex-
              tension; if it is read from `standard input' (stdin) or  from  a
              file whose name does not end in `.aptx', you will need to prefix
              it with `-t ffmpeg'.

       .argo_asf (with ffmpeg)
              Argonaut Games ASF format.

       .asf (with ffmpeg)
              Advanced / Active Streaming Format.

       .ast (with ffmpeg)
              AST Audio Stream format.

       .au, .snd (also with -t sndfile or -t ffmpeg)
              Sun Microsystems AU files.  There are many types of AU file; DEC
              has  invented its own with a different magic number and byte or-
              der.  To write a DEC file, use  the  -L  (little-endian)  output
              file option.

              Some  .au  files are known to have invalid AU headers; these are
              probably original Sun <mu>-law 8000 Hz files and  can  be  dealt
              with using the .ul format.

              It  is  possible to override AU file header information with the
              -r (sampling rate) and -c (number of channels) options, in which
              case SoX will issue a warning about the mismatch.

       .avi (with ffmpeg)
              Audio Video Interleaved format.

       .avr (also with -t ffmpeg)
              Audio Visual Research format, used by  a  number  of  commercial
              packages on the Mac.

       .caf (with sndfile, also with -t ffmpeg)
              Apple's Core Audio File format.

       .cdda, .cdr
              `Red Book' Compact Disc Digital Audio (raw audio).  CDDA has two
              audio channels formatted as 16-bit big-endian signed integers at
              a sample rate of 44.1 kHz.  The number of stereo samples in each
              CDDA track is always a multiple of 588.
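
              The multiple-of-588 rule follows from the CD sector size: one
              second of audio is 44100 stereo samples, or 75 sectors of 588
              stereo samples each.  The Python sketch below shows the arith-
              metic; the function names are hypothetical and only illustrate
              how much silence would round a track up to a whole sector.

```python
SAMPLES_PER_SECTOR = 588  # stereo samples per CDDA sector (44100 / 75)
BYTES_PER_STEREO_SAMPLE = 2 * 2  # two channels of 16-bit samples


def cdda_padding(stereo_samples):
    """Stereo samples of silence needed to reach a multiple of 588."""
    return -stereo_samples % SAMPLES_PER_SECTOR


def cdda_bytes(stereo_samples):
    """Size in bytes of the track once padded to whole sectors."""
    total = stereo_samples + cdda_padding(stereo_samples)
    return total * BYTES_PER_STEREO_SAMPLE
```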

       coreaudio (optional)
              The  MacOS  X  CoreAudio device driver supports both playing and
              recording.  If a filename is not specified or if the name is
              "default", the default audio device is selected.  Any other name
              will  be  used to select a specific device.  The valid names can
              be seen in the System Preferences->Sound menu and then under the
              Output and Input tabs.

              Examples:

                   sox_ng infile -t coreaudio
                   sox_ng infile -t coreaudio default
                   sox_ng infile -t coreaudio "Internal Speakers"


       .cvsd, .cvs
              Continuously Variable Slope Delta  modulation  is  a  headerless
              format  used  to  compress speech audio for applications such as
              voice mail with a fixed bit rate of 8kHz.  This format is  some-
              times  used with bit-reversed samples; the -X option can be used
              to set the bit order.

       .cvu   Unfiltered Continuously Variable Slope Delta  modulation  is  an
              alternative  handler for CVSD that is unfiltered but can be used
              with any sampling rate. As it is a headerless format,  you  have
              to  specify  the  sampling  rate with -r if it is different from
              8kHz.

                   sox_ng infile outfile.cvu rate 28k
                   play -r 28k outfile.cvu sinc -3.4k


       .dat   Text Data files contain a textual representation of sample data.
              There is one line at the beginning that contains the sample rate
              and one that contains the number of channels.  Subsequent  lines
              contain  two  or more numeric data items: the time since the be-
              ginning of the first sample and the sample value for each  chan-
              nel.

              Values  are  normalized so the maximum and minimum are 1 and -1.
              This file format can be used to create data files  for  external
              programs  such as FFT analyzers or graph routines.  SoX can also
              convert a file in this format back into one of  the  other  for-
              mats.

              Example containing only 2 stereo samples of silence:


                  ; Sample Rate 8012
                  ; Channels 2
                              0    0    0
                  0.00012481278  0    0
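
              A file in this layout can be produced by a few lines of code.
              The Python sketch below is an assumption-laden illustration, not
              SoX's own writer: the function to_dat is hypothetical, it sepa-
              rates fields with single spaces rather than reproducing SoX's
              column widths, and it expects sample values already normalized
              to the range -1 to 1.

```python
def to_dat(frames, rate, channels):
    """Format frames (tuples of floats, one value per channel) as .dat text."""
    lines = ["; Sample Rate %d" % rate, "; Channels %d" % channels]
    for index, frame in enumerate(frames):
        if len(frame) != channels:
            raise ValueError("frame %d does not have %d values" % (index, channels))
        # First column: time since the beginning of the first sample.
        fields = ["%.8g" % (index / rate)]
        # Remaining columns: one sample value per channel.
        fields.extend("%g" % value for value in frame)
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"
```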


       .dfpwm (with ffmpeg)
              DFPWM1a format.

              SoX  can only autodetect this type of file from its filename ex-
              tension; if it is read from `standard input' (stdin) or  from  a
              file  whose name does not end in `.dfpwm', you will need to pre-
              fix it with `-t ffmpeg'.

       .dts (with ffmpeg)
              Digital Theater Systems format.

              SoX can only autodetect this type of file from its filename  ex-
              tension;  if  it is read from `standard input' (stdin) or from a
              file whose name does not end in `.dts', you will need to  prefix
              it with `-t ffmpeg'.

       .dff   Direct Stream Digital Interchange File Format (DSDIFF) is a for-
              mat  defined by Philips for storing 1-bit DSD data, used in SACD
              mastering and occasionally for online distribution.

       .dsf, .wsd
              DSD Stream File is a format defined by Sony  for  storing  1-bit
              DSD  data,  commonly  used for online distribution of audiophile
              recordings.

       .dvms, .vms
              The Digital Voice Messaging System format is used in Germany  to
              compress  speech  audio for voice mail.  It is a self-describing
              variant of cvsd.

       .eac3 (with ffmpeg)
              Enhanced AC-3 Audio.

       .f4v (with ffmpeg)
              Another name for .mov.

       .fap (with sndfile)
              See .paf.

       ffmpeg (optional)
              This is a pseudo-type that uses the external program  ffmpeg  if
              it  is  installed.  It  can only read files, not write them, and
              will extract the sound track from many video file formats.  ffm-
              peg deduces the actual file type from the file's contents with a
              far more advanced algorithm than that used by  SoX,  which  only
              recognizes up to two fixed byte sequences at fixed offsets.

       .flac (optional; also with -t sndfile or -t ffmpeg)
              Xiph.org's  Free Lossless Audio Codec compressed audio.  FLAC is
              an open, patent-free codec designed for compressing  music.   It
              is  similar  to MP3 and Ogg Vorbis but lossless, so the audio is
              compressed without any loss in quality.

              SoX can read native FLAC files (.flac) but  can  only  read  Ogg
              FLAC files (.oga) if ffmpeg is installed.

              See  .ogg below for information relating to support for Ogg Vor-
              bis files.

              SoX can write native FLAC files according to a given or  default
              compression level.  8 is the default compression level and gives
              the  best  (but  slowest)  compression;  0  gives the least (but
              fastest) compression.  The compression level is  selected  using
              the -C option (see sox_ng(1)) with a whole number from 0 to 8.

       .flv (with ffmpeg)
              Macromedia Flash Video format.

       .fssd  Flexible Sound Studio Data format, a raw format that defaults to
              .u8 at 8kHz.

       .gsrt  Grandstream  ring-tone  files.  Whilst this file format can con-
              tain A-Law, <mu>-law, GSM, G.722, G.723, G.726, G.728,  or  iLBC
              encoded  audio,  SoX supports reading and writing only A-Law and
              <mu>-law.  E.g.

                 sox_ng music.wav -t gsrt ring.bin
                 play ring.bin


       .gsm (optional; also with -t sndfile or -t ffmpeg)
              GSM 06.10 Lossy Speech Compression.  A  lossy  format  for  com-
              pressing  speech  that  is  used in the Global System for Mobile
              Communications (GSM).  It's good for  its  purpose,  shrinking
              audio data size, but it will introduce lots of noise when an au-
              dio  signal  is encoded and decoded multiple times.  This format
              is used by some voice mail applications and is rather CPU inten-
              sive.

       .gxf (with ffmpeg)
              General eXchange Format.

       .hcom (also with -t ffmpeg)
              Macintosh HCOM files.  These are Mac  FSSD  files  with  Huffman
              compression.

       .htk (also with -t sndfile)
              Single  channel  16-bit  PCM  format  used by HTK, a toolkit for
              building Hidden Markov Model speech processing tools.

       .ircam (also with -t sndfile or -t ffmpeg)
              Another name for .sf.

       .ima (also with -t sndfile)
              A headerless file of IMA ADPCM  audio  data.  IMA  ADPCM  claims
              16-bit  precision packed into only 4 bits, but in fact sounds no
              better than .vox.

       .ism (with ffmpeg)
              ISM streaming video format.

       .kvag (with ffmpeg)
              Simon & Schuster Interactive VAG format.

       .lpc, .lpc10
              LPC-10 is a compression  scheme  for  speech  developed  by  the
              United      States      Department      of     Defense.      See
              https://github.com/jafingerhut/lpc10 for details.  There  is  no
              associated file format, so SoX's implementation is headerless.

       .m4a (with ffmpeg)
              MPEG-4 Audio format.

       .m4b (with ffmpeg)
              Another name for .mov.

       .m4v, .mp4 (with ffmpeg)
              MPEG-4 Video format.

       .mat, .mat4, .mat5 (with sndfile)
              Matlab  4.2/5.0  (respectively GNU Octave 2.0/2.1) format.  .mat
              is the same as .mat4.

       .m3u   A playlist format, containing a list of audio  files.   SoX  can
              read  but  not  write  this file format.  See [1] for details of
              this format.

       .maud  An IFF-conforming audio file type registered by  MS  MacroSystem
              Computer  GmbH and published along with the `Toccata' sound card
              on the Amiga.  It allows 8-bit linear, 16-bit linear, A-law  and
              <mu>-law in mono and stereo.

       .mj2 (with ffmpeg)
              Another name for .mov.

       .mkv, .webm (with ffmpeg)
              Matroska video format.

       .mlp (with ffmpeg)
              Meridian Lossless Packing format.

       .mov (with ffmpeg)
              QuickTime / MOV format.

       .mp2 (optional, also with -t sndfile or -t ffmpeg)
              MP2 and MP3 compressed audio (MPEG 1 Layers 2 and 3) are part of
              the MPEG standards for audio and video compression whose patents
              have  expired.   They are lossy compression formats that achieve
              good compression rates with little quality loss.

              libmad, which SoX uses to decode .mp2 files, does  not  work  on
              files  with  a  bit-rate  higher than 192K but you can read them
              with -t sndfile or -t ffmpeg.

              SoX can only autodetect MP2 files from their filename extension;
              if they are read from `standard input' (stdin) or  from  a  file
              whose name does not end in `.mp2', prefix them with `-t mp2'.

              When  writing  MP2  files,  SoX  uses  twolame  but does not use
              twolame's Variable Bit Rate (VBR) encoding  yet,  only  Constant
              Bit  Rate  (CBR).   The  bit  rate is set using the -C option in
              kbps: 32, 48, 56, 64, 80, 96, 112, 128,  160  or  192  for  mono
              files  and  double  those  for stereo ones.  At present, twolame
              outputs 0.005 seconds of quiet garbage at the  start  and  trun-
              cates the end by 0.002 seconds.

       .mp3 (optional, also with -t sndfile or -t ffmpeg)
              SoX  uses libmad to decode MP3 files; to decode using libmpg123,
              which generally gives better quality results and  is  better  at
              decoding damaged or corrupt files, use -t sndfile, while -t ffm-
              peg uses yet another MP2/3 decoder, internal to ffmpeg.

              When reading MP3 files, up to 28 bits of precision are stored al-
              though only 16 bits are returned.  This is to give  the  default
              behavior of writing 16-bit output files but you  can  specify  a
              higher precision for the output file to prevent loss of this ex-
              tra information.

              When  writing  MP3  files, SoX uses liblame and can use up to 24
              bits of precision.  At present SoX's use of  liblame  adds  0.01
              seconds  of  quiet garbage at the start and at the end; encoding
              with -t sndfile gets the length right and no garbage.

              MP3 compression parameters can be selected using SoX's -C option
              as follows:

              The primary parameter to the LAME MP3 encoder is the  bit  rate.
              If  the  -C value is a positive integer, it is taken as the bit
              rate in kbps (e.g. if you specify 128, it uses 128 kbps).

              The  second  most  important parameter is "quality" which allows
              balancing encoding speed vs.  quality.   In  LAME,  0  specifies
              highest  quality but is very slow, while 9 selects poor quality,
              but is fast. (5 is the default and 2 is recommended  as  a  good
              trade-off for high quality encodes.)

              Because  the -C value is a float, the fractional part is used to
              select quality. 128.2 selects 128 kbps encoding with  a  quality
              of  2.  There  is one problem with this approach. We need 128 to
              specify 128 kbps encoding with default quality, so 0  means  use
              default.  Instead  of  0 you have to use .01 (or .99) to specify
              the highest or lowest quality (128.01 or 128.99).

              LAME uses bitrate to specify a constant bitrate but higher qual-
              ity can be achieved using Variable Bit Rate (VBR).  VBR  quality
              (really  size)  is  selected  using  a number from 0 to 9. Use a
              value of 0 for high quality, larger  files  and  9  for  smaller
              files of lower quality. 4 is the default.

              In order to squeeze the selection of VBR into the -C value,  we
              use  negative  numbers to select VBR.  -4.2 would select default
              VBR encoding (size) with high quality (speed).  One special case
              is 0, which is a valid VBR encoding parameter but  not  a  valid
              bit  rate.   A  compression  value of 0 is always treated as high
              quality VBR; as a result, both -0.2  and  0.2  are  treated  as
              highest quality VBR (size) and high quality (speed).
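
              The rules above can be summarized in a short Python sketch.   It
              is an illustration of the description in this section, not SoX's
              actual parser: the function describe_mp3_compression is hypo-
              thetical, it takes the -C value as a string, and it assumes that
              the first decimal digit selects the quality (so .01 gives 0  and
              .99 gives 9).

```python
from decimal import Decimal

DEFAULT_QUALITY = 5  # LAME's default quality, as stated above


def describe_mp3_compression(value):
    """Split a -C value (given as a string) into (mode, rate_or_level, quality)."""
    c = Decimal(value)
    magnitude = abs(c)
    whole = int(magnitude)
    fraction = magnitude - whole
    # No fractional part means "use the default quality".
    quality = DEFAULT_QUALITY if fraction == 0 else int(fraction * 10)
    # Negative values select VBR; 0 is not a valid bit rate, so it is VBR too.
    if c < 0 or whole == 0:
        return ("VBR", whole, quality)
    return ("CBR", whole, quality)
```

              Under these assumptions, "128.2" is read as 128 kbps constant bit
              rate with quality 2, and "-4.2" as VBR level 4 with quality 2.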

              See .ogg and .opus for similar formats that achieve higher  sig-
              nal quality with less bandwidth.

       .mp4 (with ffmpeg)
              MPEG-4 video format.

       .mpeg, .mpg (with ffmpeg)
              MPEG-1 Systems / MPEG program stream format.

       .mpegts (with ffmpeg)
              MPEG-TS (MPEG-2 Transport Stream) format.

       .mxf, .mxf_opatom (with ffmpeg)
              Material eXchange Format Operational Pattern OP1A "OP-Atom" for-
              mat (SMPTE 390M).

       .nist (also with -t sndfile or -t ffmpeg)
              See .sph.

       .nsp (also with -t ffmpeg)
              SoX  can read Computerized Speech Lab NSP files that may contain
              both audio and bioelectric data.  Typically, the  first  channel
              is  sound pressure (audio) and additional channels are data such
              as laryngeal kinematic  or  aerodynamic  (air  pressure  or  air
              flow).

              The  NSP  file  format  was  also used for the Phonetic Database
              (PDB) from Speech Technology Research who had a free NSP Player,
              SpeakNSP.  CSL NSP file reading and writing is supported by  the
              WaveSurfer package.

       .nut (with ffmpeg)
              NUT  is  a low overhead generic container format that stores au-
              dio, video, subtitle and user-defined streams in  a  simple  yet
              efficient way.

       .oga (with ffmpeg)
              Various Xiph.org audio formats in an Ogg container.

       .ogg, .vorbis (optional, also with -t sndfile or -t ffmpeg)
              Xiph.org's  Ogg  Vorbis  compressed  audio; an open, patent-free
              codec designed for music and streaming audio.   It  is  a  lossy
              compression  format  (similar to MP3 and AAC) that achieves good
              compression rates with a minimal amount of quality loss.

              SoX can decode all types of Ogg Vorbis files and can  encode  at
              different compression levels/qualities given as a number from -1
              (highest  compression/lowest quality) to 10 (lowest compression,
              highest quality).  By default the encoding quality  level  is  3
              (which gives an encoded rate of approx. 112kbps) but this can be
              changed  using  the -C option with a number from -1 to 10; frac-
              tional numbers (e.g.  3.6) are also allowed.  Decoding is  some-
              what CPU intensive and encoding is very CPU intensive.

              See .mp3 for a similar format.

       .opus (optional)
              Xiph.org's  Opus compressed audio is an open, lossy, low-latency
              codec offering a wide range of compression rates  and  uses  the
              Ogg container.

              SoX can only read Opus files, not write them.

       oss (optional)
              The Open Sound System /dev/dsp device driver supports both play-
              ing  and recording audio.  OSS support is available in Unix-like
              operating systems, sometimes  together  with  alternative  sound
              systems (such as ALSA).  Examples:

                   sox_ng infile -t oss
                   sox_ng infile -t oss /dev/dsp
                   sox_ng -b 16 -t oss /dev/dsp outfile


       .paf, .fap (with sndfile, also with -t ffmpeg)
              Ensoniq PARIS file format (big and little-endian respectively).

       .pls   A  playlist  format  containing  a list of audio files.  SoX can
              read, but not write this file format.  See [2]  for  details  of
              this format.

              Note: SoX's support for SHOUTcast PLS relies on wget(1)  and  is
              only  partial:  it's necessary to specify the audio type manu-
              ally, e.g.

                   play -t mp3 "http://a.server/pls?rn=265&file=filename.pls"

              and  SoX  does  not  know about alternative servers - hit Ctrl-C
              twice in quick succession to quit.

       .prc   Psion Record files are used in Psion EPOC PDAs (Series 5, Revo and
              similar)  for  System alarms and recordings made by the built-in
              Record application.  When writing, SoX defaults to A-law,  which
              is  recommended;  if  you  must  use ADPCM, use the -e ima-adpcm
              switch.  The sound quality is poor because Psion Record seems to
              insist on frames of 800 samples or fewer, so the ADPCM codec has
              to be reset every 800 samples, which causes the sound to  glitch
              every tenth of a second.

       pulseaudio (optional)
              PulseAudio  is  a  cross-platform  networked  sound server.  The
              PulseAudio driver supports both playing and recording of  audio.
              If  a  file  name  is specified with this driver, it is ignored.
              Examples:

                   sox_ng infile -t pulseaudio
                   sox_ng infile -t pulseaudio default


       .pvf (with sndfile)
              Portable Voice Format.

       .ra (with ffmpeg)
              RealAudio format.

       raw    Headerless audio data. See the first entry in this list for  de-
              tails.

       .rm (with ffmpeg)
              RealMedia format.

       .rso (with ffmpeg)
              Lego Mindstorms RSO format.

              SoX  can only autodetect this type of file from its filename ex-
              tension; if it is read from `standard input' (stdin) or  from  a
              file  whose name does not end in `.rso', you will need to prefix
              it with `-t ffmpeg'.

       .sbc (with ffmpeg)
              Bluetooth SIG low-complexity subband audio format.

              SoX can only autodetect this type of file from its filename  ex-
              tension;  if  it is read from `standard input' (stdin) or from a
              file whose name does not end in `.sbc', you will need to  prefix
              it with `-t ffmpeg'.

       .sd2 (with sndfile)
              Sound Designer 2 format.

       .sds (with sndfile, also with -t ffmpeg)
              MIDI Sample Dump Standard.

       .sf (also with -t sndfile or -t ffmpeg)
              IRCAM   SDIF  (Institut  de  Recherche  et  Coordination  Acous-
              tique/Musique Sound Description Interchange Format) is  used  by
              academic  music  software  such  as  the  CSound package and the
              MixView sound sample editor.

       .sln (also with -t ffmpeg)
              Asterisk PBX `signed linear' 8kHz, 16-bit signed  integer,  lit-
              tle-endian raw format.

       .smjpeg (with ffmpeg)
              Loki SDL MJPEG.

       .smp   SMP  files  are  for use with the PC-DOS package SampleVision by
              Turtle Beach Softworks, which  communicates  with  several  MIDI
              samplers.   All  sample  rates  are supported by the package al-
              though not all are supported by the samplers  themselves.   Loop
              points are currently ignored.

       .snd   Several file formats use the .snd extension.

              The  main one was by NeXT, essentially the same as Sun Microsys-
              tems' .au format.  See .au.

              Apple made another .snd format in which the first two bytes  are
              a  16-bit  integer  representing the number 1 or 2 but which can
              often be read as a raw format.

              Akai had an audio file format for its MPC range of  samplers  of
              which  the  first  byte contains the number 1 and the second the
              number 4.  See .mpc2k.

              There are also Sounder and SoundTool files  from  MS-DOS/Windows
              in the early '90s.  See .sndr and .sndt.

              Lastly,  the  HOM-BOT  Robot Vacuum Cleaner and the V.Flash Home
              Entertainment System use .snd audio files which are raw  single-
              channel  16-bit  16kHz PCM and the Unity Game Engine uses a com-
              pressed format called .snd.

       sndfile (optional)
              This is a pseudo-type that forces libsndfile to  be  used.   For
              writing  files,  the  actual  file type is taken from the output
              file name; for reading them, it is deduced from the file.

       sndio (optional)
              The OpenBSD  audio  device  driver  supports  both  playing  and
              recording audio.

       .sndr  Sounder  files  are an MS-DOS/Windows format from the early '90s
              that usually have the extension `.snd'.

       .sndt  SoundTool files are another MS-DOS/Windows format from the early
              '90s that usually have the extension `.snd'.

       .sou   An alias for the .u8 raw format.

       .sox (also with -t ffmpeg)
              SoX's native uncompressed PCM format is intended for storing or
              piping audio at intermediate processing points between SoX invo-
              cations.  It has much in common with the WAV, AIFF and AU uncom-
              pressed PCM formats but has the following specific characteris-
              tics: the PCM samples are stored as 32-bit signed integers, the
              samples are stored (by default) as `native endian' and the num-
              ber of samples in the file is recorded as a 64-bit integer.
              Comments are also supported.

              See the section `Special Filenames' in sox_ng(1) for examples of
              using the .sox format with pipes.

       .spdif (with ffmpeg)
              IEC 61937 S/PDIF format.

              SoX  can only autodetect this type of file from its filename ex-
              tension; if it is read from `standard input' (stdin) or  from  a
              file  whose name does not end in `.spdif', you will need to pre-
              fix it with `-t ffmpeg'.

       .sph, .nist (also with -t sndfile or -t ffmpeg)
              SPHERE (SPeech HEader REsources) is a file format defined by
              NIST (National Institute of Standards and Technology) and is
              used with speech audio.  SoX can read these files when they con-
              tain <mu>-law or PCM data.  It ignores header information that
              says the data is compressed using shorten compression and
              treats the data as either <mu>-law or PCM.  SoX and the
              command-line shorten program can be run together using pipes:
              shorten uncompresses the data and passes the result to SoX for
              processing.

       .spx, .speex (with ffmpeg)
              Ogg Speex format is for high compression of speech that, in VBR
              mode, achieves higher quality than AMR or GSM, but is now con-
              sidered superseded by the more recent Opus codec.

       sunau (optional)
              The  Sun  /dev/audio  device  driver  supports  both playing and
              recording audio.  For example:

                   sox_ng infile -t sunau /dev/audio

              or

                   sox_ng infile -t sunau -e mu-law -c 1 /dev/audio

              for older Sun equipment.


       .svcd (with ffmpeg)
              Another name for .mov.

       .tta (with ffmpeg)
              True Audio format.

       .vag (with ffmpeg)
              Sony PS2 VAG format.

       .txw   TXW is a file format from the Yamaha TX-16W sampling keyboard,
              which wrote samples onto IBM/PC-format 3.5" floppies at nominal
              sample rates of 16kHz, 33kHz and 50kHz, all exact divisors of
              100kHz.  When these are decoded to WAV files or other formats
              that only represent integer sample rates, the first two are
              declared as 16667 and 33333, but in WAV files the byte rate
              field is wrong at 66667.

              SoX  handles  reading of files which do not have the sample rate
              field set to one of the expected rates by looking at some  other
              bytes  in  the attack/loop length fields and defaulting to 33kHz
              if the sample rate is still unknown.
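
              The integer rates quoted above can be reproduced with shell
              arithmetic.  This is only a sketch: the divisors 6, 3 and 2 are
              inferred from the declared rates rather than taken from the TXW
              format itself, and the function name txw_rate is made up for
              this example.

```shell
# Hypothetical helper: 100kHz divided by d, rounded to the nearest integer.
txw_rate() {
    echo $(( (100000 + $1 / 2) / $1 ))
}

txw_rate 6   # prints 16667 (nominal 16kHz)
txw_rate 3   # prints 33333 (nominal 33kHz)
txw_rate 2   # prints 50000 (50kHz)
```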

       .vcd (with ffmpeg)
              Another name for .mov.

       .vms   See .dvms.

       .vob (with ffmpeg)
              Another name for .mov.

       .voc (also with -t sndfile or -t ffmpeg)
              Sound Blaster VOC  files  are  multi-part  and  contain  silence
              parts,  looping and different sample rates for different chunks.
              On input, the silence parts are filled out, loops are  rejected,
              and  sample  data  with  a new sample rate is rejected.  Silence
              with a different sample rate  is  generated  appropriately.   On
              output,  silence  is  not  detected,  nor  are impossible sample
              rates.  SoX reads but  cannot  write  VOC  files  with  multiple
              blocks  and files containing <mu>-law, A-law and 2/3/4-bit ADPCM
              samples.

       .vorbis
              See .ogg.

       .vox   Headerless files of Dialogic/OKI ADPCM audio data commonly come
              with the extension .vox.  This ADPCM data has 12-bit precision
              packed into only 4 bits.

              Note: some early Dialogic hardware does not always reset the AD-
              PCM encoder at the start of each vox file.  This can  result  in
              clipping and/or DC offset problems when it comes to decoding the
              audio.  While little can be done about the clipping, a DC offset
              can  be removed by passing the decoded audio through a high-pass
              filter, e.g.:

                   sox_ng input.vox output.wav highpass 10


       .w64 (with sndfile, also with -t ffmpeg)
              Sonic Foundry's 64-bit RIFF/WAV format.

              SoX can only autodetect this type of file from its filename  ex-
              tension;  if  it is read from `standard input' (stdin) or from a
              file whose name does not end in `.w64', you will need to  prefix
              it with `-t w64'.
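
              For a regular file whose name lacks the usual extension, the
              type can often be guessed from the first four bytes.  The
              sketch below assumes the usual magic numbers (`RIFF' or `RIFX'
              for WAV and lower-case `riff' for W64), which come from the
              file formats themselves rather than from this manual; the
              function name riff_type is made up for this example.

```shell
# Hypothetical helper, not part of SoX: suggest a file type for -t.
riff_type() {
    # First four bytes as lowercase hex.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case $magic in
        52494646|52494658) echo "wav" ;;      # "RIFF" or "RIFX"
        72696666)          echo "w64" ;;      # "riff", start of the W64 GUID
        *)                 echo "unknown" ;;
    esac
}
```

              If the function prints w64, that name can be passed to -t in
              front of the file name as described above.  Audio arriving on
              `standard input' cannot be inspected this way, so its type
              still has to be given by hand.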

       .wav (also with -t sndfile or -t ffmpeg)
              Microsoft  .WAV  RIFF  files are the native audio file format of
              Windows and widely used for uncompressed audio.

              Normally .wav files have all  formatting  information  in  their
              headers,  so  format options do not usually need to be specified
              for input files.  If any are, they override the file header  and
              you  will  be warned to this effect.  Output format options will
              cause a format conversion and the .wav is written appropriately.

              SoX can read and write linear PCM, floating point, <mu>-law, A-
              law, MS ADPCM and IMA (or DVI) ADPCM-encoded samples.  WAV files
              can also contain audio encoded in other ways not currently sup-
              ported by SoX (e.g. MP3); in some cases such a file can still
              be read by SoX by overriding the file type, e.g.

                 play -t mp3 mp3-encoded.wav


              Natively, SoX can only read WAV files with a bit-depth of 8, 16,
              24  or  32; files with other bit-depths can be read by preceding
              them with -t sndfile.

              Big endian versions of RIFF files, called RIFX,  are  also  sup-
              ported.  To write a RIFX file, use the -B output file option.

              See also .wavpcm.

       waveaudio (optional)
              The MS-Windows native audio device driver.  Examples:

                   sox_ng infile -t waveaudio
                   sox_ng infile -t waveaudio default
                   sox_ng infile -t waveaudio 1
                   sox_ng infile -t waveaudio "High Definition Audio Device"

              If the device name is omitted or is given as -1 or default, you
              get the `Microsoft Wave Mapper' device.  Wave Mapper means `use
              the system default audio devices' and you can control what
              `default' means via the OS Control Panel.

              If the given device name is some other number, you get that  au-
              dio  device  by its index, so recording with device name 0 would
              get the first input device (perhaps the microphone), 1 would get
              the second (perhaps line in), etc.  Playback using device name 0
              will get the first output device (usually  the  only  audio  de-
              vice).

              If  the  given device name is something other than a number, SoX
              tries to match it (to a maximum of 31  characters)  against  the
              names of the available devices.
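
              The three cases above can be summarized in a small shell
              sketch.  It only classifies a device argument according to the
              rules described here; it is not how SoX itself is implemented,
              the function name waveaudio_device is made up for this example
              and the way SoX compares names beyond the 31-character limit
              is not modelled.

```shell
# Hypothetical helper: classify a waveaudio device argument.
waveaudio_device() {
    case ${1-} in
        ''|-1|default) echo "wave mapper" ;;            # system default device
        *[!0-9]*)      printf 'name: %.31s\n' "$1" ;;   # at most 31 characters
        *)             echo "index: $1" ;;              # all digits: an index
    esac
}
```

              For instance, waveaudio_device 1 reports an index, whereas
              waveaudio_device "High Definition Audio Device" reports a name.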


       .wavpcm
              A  non-standard  but widely used variant of .wav.  Some applica-
              tions cannot read a standard WAV  file  header  for  PCM-encoded
              data  with  a sample size greater than 16 bits or with more than
              two channels but can read a  non-standard  WAV  header.   It  is
              likely that such applications will eventually be updated to sup-
              port the standard header but, in the meantime, this SoX format
              can be used to create files with the  non-standard  header  that
              should work with these applications.  SoX will automatically de-
              tect and read WAV files with a non-standard header.

              The  most common use of this file type is likely to be along the
              following lines:

                   sox_ng infile.any -t wavpcm -e signed-integer outfile.wav


       .webm (with ffmpeg)
              See .mkv.

       .wma (with ffmpeg)
              Windows Media Audio format.

       .wsaud (with ffmpeg)
              Westwood Studios audio format.

       .wsd   Wideband Single-bit Data is the same as .dsf but with a  differ-
              ent header.

       .wtv (with ffmpeg)
              Windows Television format.

       .wv (also with -t sndfile or -t ffmpeg)
              WavPack  lossless audio compression.  Note that, when converting
              .wav to this format and back again, the RIFF header is not  nec-
              essarily preserved losslessly, though the audio is.

       .wve (also with -t sndfile)
              Psion 8-bit A-law is used on Psion SIBO PDAs (Series 3 and simi-
              lar).

       .xa (also with -t ffmpeg)
              Maxis XA files are 16-bit ADPCM audio files used by Maxis games.
              Writing  .xa  files  is currently not supported, although adding
              write support should not be very difficult.

       .xi (with sndfile)
              Fasttracker 2 Extended Instrument format.

SEE ALSO
       sox_ng(1), soxi_ng(1).

       The SoX web site at https://codeberg.org/sox_ng

   References
       [1]    Wikipedia, M3U, http://en.wikipedia.org/wiki/M3U

       [2]    Wikipedia, PLS, http://en.wikipedia.org/wiki/PLS_(file_format)

AUTHORS
       Lance Norskog, Chris Bagwell and many other  authors  and  contributors
       listed in the README file that is distributed with the source code.

soxformat_ng                   November 28, 2024                        SoX(7)
