Converting media files

This module contains wrappers for the ffmpeg command-line utility. The aim is to expose the most commonly used functionality in a concise and Pythonic way, while still allowing extra options that AVTK does not support explicitly to be passed to the underlying ffmpeg invocation.

Synopsis:

FFmpeg(inputs, outputs).run()

Inputs and outputs can either be lists (in case of multiple inputs or outputs), or a single item (in case of single input or output).

Input can be either a string (input file path or stream URL), or an instance of Input. Output can either be a string (output file path) or an instance of Output specifying the format, codecs to use and other options.
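
The "single item or list" convention described above can be handled with a small normalization step. The following is a minimal sketch; the helper name as_list is hypothetical and is not part of AVTK.

```python
def as_list(value):
    """Wrap a single item in a list; return lists and tuples as a new list."""
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


# A single path and a list of paths end up in the same shape
print(as_list('input.mp4'))
print(as_list(['video.mp4', 'audio.m4a']))
```

With this in place, the rest of the code only ever has to deal with lists.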

Examples:

# Convert MP4 to MKV with default codecs and options
FFmpeg('input.mp4', 'output.mkv').run()

# Convert 1 minute of input video starting at 5min mark into 1080p HEVC video
# with 160kbit AAC audio, stripping any subtitles
FFmpeg(
    Input('input.mkv', seek=300, duration=60),
    Output('output.mp4', streams=[
        H265(scale=(-1, 1080), preset='veryfast', crf=20),
        Audio('aac', bit_rate='160k'),
        NoSubtitles
    ])
).run()

# Split video into separate video and audio files
FFmpeg('input.mp4', [
    Output('video.mp4', streams=[H264(), NoAudio]),
    Output('audio.m4a', streams=[AAC(), NoVideo])
]).run()

Stream, format, input and output definitions all take an optional extra parameter - a list of strings that are passed directly to ffmpeg in appropriate places. This allows you to use all functionality of ffmpeg without explicit support in AVTK.

class Duration(val)

Bases: object

Helper class for parsing duration and time values - don’t use it directly.

Supported duration types: timedelta, Decimal, float, int, and numeric strings such as '5.25'

Examples:

Duration(timedelta(seconds=3.14))
Duration('5.25')
Duration(60.5)
Duration(3600)
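
As an illustration of what such a helper has to do, here is a minimal sketch that normalizes the supported value types to a Decimal number of seconds. The function name to_seconds is hypothetical, and this is not necessarily how Duration is implemented in AVTK.

```python
from datetime import timedelta
from decimal import Decimal


def to_seconds(val):
    """Convert a timedelta or a number-like value to Decimal seconds."""
    if isinstance(val, timedelta):
        # Build the result from integer parts to avoid float rounding
        whole = Decimal(val.days * 86400 + val.seconds)
        return whole + Decimal(val.microseconds) / Decimal(1000000)
    if isinstance(val, float):
        # Go through str() so that 0.1 becomes Decimal('0.1') rather than
        # its long binary expansion
        return Decimal(str(val))
    # int, Decimal and numeric strings are accepted by Decimal directly
    return Decimal(val)


print(to_seconds(timedelta(seconds=3.14)))
print(to_seconds('5.25'))
print(to_seconds(60.5))
print(to_seconds(3600))
```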

class Stream(encoder)

Bases: object

Output stream definition - base class

Don’t use this class directly. Instead use Video, Audio or Subtitle subclass, or one of the convenience helper classes for specific codecs.

class Video(encoder, scale=None, bit_rate=None, frames=None, extra=None)

Bases: avtk.backends.ffmpeg.convert.Stream

Output video stream definition

Parameters
  • encoder (str) – encoder to use

  • scale (tuple) – resize output to specified size (width, height) - optional

  • bit_rate (int or str) – target video bitrate - optional

  • frames (int) – number of frames to output - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the stream - optional

Use avtk.backends.ffmpeg.cap.get_available_encoders() to get information on supported encoders. You can also pass None as the encoder to disable the video stream (see NoVideo), or 'copy' to copy the video stream without re-encoding (see CopyVideo).

If scale is set, either width or height may be -1 to use the optimal size for preserving aspect ratio, or 0 to keep the original size.
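
The -1 and 0 conventions match those of ffmpeg's scale video filter, so a (width, height) tuple can be turned into a filter expression directly. This is a minimal sketch; the helper name scale_filter is hypothetical, and whether AVTK emits exactly this filter is an assumption.

```python
def scale_filter(scale):
    """Return an ffmpeg scale filter expression for a (width, height) tuple."""
    width, height = scale
    if width < -1 or height < -1:
        raise ValueError('width and height must be -1, 0 or a positive size')
    return 'scale={}:{}'.format(width, height)


# Fixed height of 480 pixels, width derived from the aspect ratio
print(['-vf', scale_filter((-1, 480))])
# Keep the original width, force the height to 720 pixels
print(['-vf', scale_filter((0, 720))])
```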

Bitrates should be specified as integers or as strings in ‘NUMk’ or ‘NUMm’ format.
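
To make the accepted formats concrete, the sketch below converts such a value to bits per second. The helper name parse_bit_rate is hypothetical, and the decimal multipliers (k = 1000, m = 1000000) are an assumption based on the "2mbps" example in this section.

```python
import re

# Assumed multipliers for the 'NUMk' and 'NUMm' suffixes
_SUFFIXES = {'': 1, 'k': 1000, 'm': 1000000}


def parse_bit_rate(value):
    """Return bits per second from an int or a 'NUMk' / 'NUMm' string."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r'(\d+)([km]?)', value.strip().lower())
    if match is None:
        raise ValueError('unsupported bitrate: {!r}'.format(value))
    number, suffix = match.groups()
    return int(number) * _SUFFIXES[suffix]


print(parse_bit_rate('2m'))
print(parse_bit_rate('160k'))
print(parse_bit_rate(128000))
```

Rejecting anything that does not match the pattern catches typos before they reach the ffmpeg command line.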

Examples:

# Encode to 480p H.264 at 2 Mbps, keeping aspect ratio
Video('libx264', scale=(-1, 480), bit_rate='2m')

# Convert only one frame to PNG
Video('png', frames=1)

# Passthrough (copy video stream from input to output)
Video('copy')
# or
CopyVideo

# Disable (ignore) video streams
NoVideo
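
The three cases above (encode, copy, disable) correspond to standard ffmpeg options: -c:v selects the video encoder, with 'copy' as a special value, and -vn drops video entirely. A minimal sketch of that mapping follows; the helper name video_codec_args is hypothetical and only illustrates the idea.

```python
def video_codec_args(encoder):
    """Map an encoder value to ffmpeg video stream options."""
    if encoder is None:
        return ['-vn']          # disable video streams
    return ['-c:v', encoder]    # a real encoder name, or 'copy' for passthrough


print(video_codec_args('png'))
print(video_codec_args('copy'))
print(video_codec_args(None))
```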

class H264(preset=None, crf=None, **kwargs)

Bases: avtk.backends.ffmpeg.convert.Video

Video stream using H.264 (AVC) codec and libx264 encoder

Parameters
  • preset (str) – encoder preset (quality / encoding speed tradeoff) - optional

  • crf (str) – constant rate factor (determines target quality) - optional

  • scale (tuple) – resize output to specified size (width, height) - optional

  • bit_rate (int or str) – target video bitrate - optional

  • frames (int) – number of frames to output - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the stream - optional

See https://trac.ffmpeg.org/wiki/Encode/H.264 for tips on H.264 encoding with ffmpeg, and for description of CRF and preset parameters.

Example:

# Encode to high quality, taking as much time as needed
H264(preset='veryslow', crf=18)
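
For reference, preset and CRF map to the libx264 options -preset and -crf. The sketch below validates both values before building the arguments, using the preset names and the 0 to 51 CRF range described in the H.264 guide linked above. The helper name x264_args is hypothetical, and AVTK itself may not perform this validation.

```python
X264_PRESETS = (
    'ultrafast', 'superfast', 'veryfast', 'faster', 'fast',
    'medium', 'slow', 'slower', 'veryslow', 'placebo',
)


def x264_args(preset=None, crf=None):
    """Build ffmpeg options for a libx264 video stream."""
    args = ['-c:v', 'libx264']
    if preset is not None:
        if preset not in X264_PRESETS:
            raise ValueError('unknown preset: {!r}'.format(preset))
        args += ['-preset', preset]
    if crf is not None:
        if not 0 <= int(crf) <= 51:
            raise ValueError('crf must be between 0 and 51')
        args += ['-crf', str(crf)]
    return args


print(x264_args(preset='veryslow', crf=18))
```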

class H265(preset=None, crf=None, **kwargs)

Bases: avtk.backends.ffmpeg.convert.Video

Video stream using H.265 (HEVC) codec and libx265 encoder

Parameters
  • preset (str) – encoder preset (quality / encoding speed tradeoff) - optional

  • crf (str) – constant rate factor (determines target quality) - optional

  • scale (tuple) – resize output to specified size (width, height) - optional

  • bit_rate (int or str) – target video bitrate - optional

  • frames (int) – number of frames to output - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the stream - optional

Uses a single-pass constant quality encode process for HEVC. See https://trac.ffmpeg.org/wiki/Encode/H.265 for tips on HEVC encoding with ffmpeg, and for description of the CRF and preset parameters.

Example:

# Encode to high quality, taking as much time as needed
H265(preset='veryslow', crf=20)

class VP9(crf=None, **kwargs)

Bases: avtk.backends.ffmpeg.convert.Video

Video stream using VP9 codec and libvpx-vp9 encoder

Parameters
  • crf (str) – constant rate factor (determines target quality) - optional, default is 31

  • scale (tuple) – resize output to specified size (width, height) - optional

  • frames (int) – number of frames to output - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the stream - optional

Uses a single-pass constant quality encode process for VP9. See https://trac.ffmpeg.org/wiki/Encode/VP9 for tips on VP9 encoding with ffmpeg, and for description of the CRF parameter.

Example:

# Encode using default parameters (CRF of 31)
VP9()
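
The VP9 guide linked above describes constant quality mode as setting -crf together with a video bitrate of 0. A minimal sketch of the resulting options is shown here; the helper name vp9_cq_args is hypothetical, and the exact arguments AVTK emits are an assumption.

```python
def vp9_cq_args(crf=31):
    """Build ffmpeg options for single-pass constant quality VP9."""
    if not 0 <= int(crf) <= 63:
        raise ValueError('crf must be between 0 and 63')
    # '-b:v 0' is what switches libvpx-vp9 into constant quality mode
    return ['-c:v', 'libvpx-vp9', '-crf', str(crf), '-b:v', '0']


print(vp9_cq_args())
```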

class AV1(crf=None, **kwargs)

Bases: avtk.backends.ffmpeg.convert.Video

Video stream using AV1 codec and libaom-av1 encoder

Based on AV1 wiki guide: https://trac.ffmpeg.org/wiki/Encode/AV1

Warning

Uses an extremely slow and unoptimized reference AV1 encoder implementation. You probably don’t want to use this.

class Audio(encoder, channels=None, bit_rate=None, extra=None)

Bases: avtk.backends.ffmpeg.convert.Stream

Output audio stream definition

Parameters
  • encoder (str) – encoder to use

  • channels (int) – number of channels to downmix to - optional

  • bit_rate (int or str) – target audio bitrate - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the stream - optional

Use avtk.backends.ffmpeg.cap.get_available_encoders() to get information on supported encoders. You can also pass None as the encoder to disable the audio streams (see NoAudio), or 'copy' to copy the stream without re-encoding (see CopyAudio).

Bitrates should be specified as integers or as strings in ‘NUMk’ format.

Examples:

# Encode to AAC at 256kbps, while downmixing to stereo
Audio('aac', channels=2, bit_rate='256k')

# Passthrough (copy audio stream directly to output)
Audio('copy')
# or
CopyAudio

# Disable (ignore) audio stream
NoAudio
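
The audio parameters correspond to the ffmpeg options -c:a (encoder), -ac (channel count), -b:a (bitrate) and -an (no audio). The following sketch shows one way to assemble them; the helper name audio_args is hypothetical and the option order is an assumption.

```python
def audio_args(encoder, channels=None, bit_rate=None):
    """Build ffmpeg options for an output audio stream."""
    if encoder is None:
        return ['-an']                      # disable audio streams
    args = ['-c:a', encoder]                # encoder name or 'copy'
    if channels is not None:
        args += ['-ac', str(channels)]      # downmix to this many channels
    if bit_rate is not None:
        args += ['-b:a', str(bit_rate)]     # target audio bitrate
    return args


print(audio_args('aac', channels=2, bit_rate='256k'))
```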

class AAC(**kwargs)

Bases: avtk.backends.ffmpeg.convert.Audio

Audio stream using AAC

Parameters
  • channels (int) – number of channels to downmix to - optional

  • bit_rate (int or str) – target audio bitrate - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the stream - optional

class Opus(**kwargs)

Bases: avtk.backends.ffmpeg.convert.Audio

Audio stream using Opus codec and libopus encoder

Parameters
  • channels (int) – number of channels to downmix to - optional

  • bit_rate (int or str) – target audio bitrate - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the stream - optional

class Subtitle(encoder)

Bases: avtk.backends.ffmpeg.convert.Stream

Output subtitle stream definition

Use avtk.backends.ffmpeg.cap.get_available_encoders() to get information on supported encoders. You can also pass None as the encoder to disable the subtitle streams (see NoSubtitles), or 'copy' to copy the stream without converting (see CopySubtitles).

NoVideo = <avtk.backends.ffmpeg.convert.Video object>

Disable video

CopyVideo = <avtk.backends.ffmpeg.convert.Video object>

Copy video without re-encoding

NoAudio = <avtk.backends.ffmpeg.convert.Audio object>

Disable audio

CopyAudio = <avtk.backends.ffmpeg.convert.Audio object>

Copy audio without re-encoding

NoSubtitles = <avtk.backends.ffmpeg.convert.Subtitle object>

Disable subtitles

CopySubtitles = <avtk.backends.ffmpeg.convert.Subtitle object>

Copy subtitles without converting

class Input(source, seek=None, duration=None, extra=None)

Bases: object

Define input source

Parameters
  • source (str) – input file path or stream URL

  • seek (see Duration) – seek in source before processing - optional

  • duration (see Duration) – how much of source to process - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the input - optional

Raises

NoMediaError – if source doesn’t exist

Source can either be a local file, capture device or network stream supported by the underlying ffmpeg tool.

Examples:

# Use local file, seek to 2min mark and process 5 minutes
Input('test-media/video/sintel.mkv', seek=120, duration=300)

# Process one minute of an internet radio stream
Input('https://example.radio.fm/stream.aac', duration=60)
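
In ffmpeg, seek and duration map to -ss and -t, and both apply to an input when they are placed before its -i option. The sketch below shows that ordering; the helper name input_args is hypothetical, and the position of the extra arguments is an assumption.

```python
def input_args(source, seek=None, duration=None, extra=None):
    """Build ffmpeg options for one input; input options must precede -i."""
    args = []
    if seek is not None:
        args += ['-ss', str(seek)]       # seek position in seconds
    if duration is not None:
        args += ['-t', str(duration)]    # how much of the input to read
    args += list(extra or [])            # passed through unchanged
    args += ['-i', source]
    return args


print(input_args('test-media/video/sintel.mkv', seek=120, duration=300))
```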

class Format(name, extra=None)

Bases: object

Output container format definition

Parameters
  • name (str) – format name

  • extra (list(str)) – additional ffmpeg command line arguments for the format - optional

Use avtk.backends.ffmpeg.cap.get_available_formats() to get information on supported formats.

class MP4(faststart=False)

Bases: avtk.backends.ffmpeg.convert.Format

MP4 output format

Parameters

faststart (bool) – place the moov atom metadata at the beginning - optional, default False

Set this option if you want to support playback of an incomplete MP4 file (that is, starting playback before the entire file is downloaded).
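
On the ffmpeg command line, moving the moov atom to the start of the file is requested with -movflags +faststart. A minimal sketch of how the flag could be translated follows; the helper name mp4_args is hypothetical and the exact arguments AVTK emits are an assumption.

```python
def mp4_args(faststart=False):
    """Build ffmpeg output format options for MP4."""
    args = ['-f', 'mp4']
    if faststart:
        # Relocate the moov atom to the beginning of the file
        args += ['-movflags', '+faststart']
    return args


print(mp4_args())
print(mp4_args(faststart=True))
```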

class WebM

Bases: avtk.backends.ffmpeg.convert.Format

WebM output format

class Matroska

Bases: avtk.backends.ffmpeg.convert.Format

Matroska (MKV) output format

class Ogg

Bases: avtk.backends.ffmpeg.convert.Format

Ogg output format

class Output(target, streams=None, fmt=None, extra=None)

Bases: object

Defines an output to the transcoding process.

Parameters
  • target (str) – output file path

  • streams (list(Stream) or None) – output stream definitions - optional

  • fmt (Format, str or None) – output format - optional

  • extra (list(str)) – additional ffmpeg command line arguments for the output - optional

If streams are not specified, all input streams will be transcoded using default codecs for the specified format.

If format is not specified, it is guessed from the output file name. If specified, it can either be a format name or an instance of Format.
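
ffmpeg itself infers the container from the output file name when no -f option is given, so a wrapper can simply omit the option. For illustration only, the sketch below makes the guess explicit for a few extensions. The helper name format_args and the extension table are hypothetical; the values are ffmpeg muxer names.

```python
import os.path

# Hypothetical lookup table: file extension -> ffmpeg muxer name
EXTENSION_FORMATS = {
    '.mp4': 'mp4',
    '.mkv': 'matroska',
    '.webm': 'webm',
    '.ogg': 'ogg',
}


def format_args(target, fmt=None):
    """Return ['-f', name] for an explicit or guessed format, else []."""
    if fmt is None:
        ext = os.path.splitext(target)[1].lower()
        fmt = EXTENSION_FORMATS.get(ext)
    # An empty list leaves the decision to ffmpeg
    return ['-f', fmt] if fmt else []


print(format_args('output.mkv'))
print(format_args('output.dat', fmt='webm'))
print(format_args('output.bin'))
```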

class FFmpeg(inputs, outputs)

Bases: object

Convert audio/video

Parameters
  • inputs (Input, str, list(Input) or list(str)) – one or more inputs

  • outputs (Output, str, list(Output) or list(str)) – one or more outputs

Input(s) and output(s) can either be specified as strings representing input and output files respectively, or as Input and Output objects with all the configuration exposed by those classes.

If only one input is used, it can be specified directly. If multiple inputs are used, specify a list of inputs. Likewise for outputs.

Examples:

# Convert input to output
FFmpeg('input.mp4', 'output.mkv').run()

# Combine audio and video
FFmpeg(['video.mp4', 'audio.m4a'], 'output.mkv').run()

# Split audio and video
FFmpeg('input.mp4', [
    Output('video.mp4', streams=[H264(), NoAudio]),
    Output('audio.m4a', streams=[AAC(), NoVideo])
]).run()

get_args()

Builds ffmpeg command line arguments

Returns

ffmpeg command line arguments for the specified process

Return type

list of strings

Example:

>>> FFmpeg('input.mp4', [
...     Output('video.mp4', streams=[H264(), NoAudio]),
...     Output('audio.m4a', streams=[AAC(), NoVideo])
... ]).get_args()
[
    '-i', 'input.mp4',
    '-c:v', 'libx264', '-an', 'video.mp4',
    '-c:a', 'aac', '-vn', 'audio.m4a'
]
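
Because the result is a plain list of strings, it can be turned into a shell-safe command line for logging or debugging with shlex.join from the standard library (Python 3.8 or later). In this sketch the argument list is written out by hand rather than produced by AVTK, and prefixing it with 'ffmpeg' as the executable name is an assumption.

```python
import shlex

# Hand-written stand-in for the list returned by get_args()
args = [
    '-i', 'input.mp4',
    '-c:v', 'libx264', '-an', 'video.mp4',
    '-c:a', 'aac', '-vn', 'audio.m4a',
]

# shlex.join quotes only the items that need it
command = shlex.join(['ffmpeg'] + args)
print(command)

# A path containing a space gets quoted automatically
print(shlex.join(['ffmpeg', '-i', 'my video.mp4', 'out.mkv']))
```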

run(text=True)

Runs the conversion process

Uses get_args() to build the command line and runs it using avtk.backends.ffmpeg.run.ffmpeg().

Parameters

text (bool) – whether to return the output as text - optional, default True

Returns

output (stdout) from ffmpeg invocation

Return type

str if text=True (default), bytes if text=False
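
The str versus bytes distinction can be illustrated with the standard subprocess module. This is a minimal sketch and not the implementation of avtk.backends.ffmpeg.run.ffmpeg(); the helper name run_command is hypothetical, and a small Python command stands in for ffmpeg so that the example runs without ffmpeg installed.

```python
import subprocess
import sys


def run_command(args, text=True):
    """Run a command and return its stdout as str (text=True) or bytes."""
    result = subprocess.run(args, stdout=subprocess.PIPE, check=True)
    return result.stdout.decode('utf-8') if text else result.stdout


# Stand-in for an ffmpeg invocation
cmd = [sys.executable, '-c', 'print("done")']

print(type(run_command(cmd)))
print(type(run_command(cmd, text=False)))
```

With check=True, a non-zero exit status raises subprocess.CalledProcessError instead of returning partial output.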