hachoir-metadata extracts metadata from multimedia files: music, pictures,
video, and also archives. It supports the most common file formats:

 * Archives: bzip2, gzip, zip, tar
 * Audio: MPEG audio ("MP3"), WAV, Sun/NeXT audio, Ogg/Vorbis (OGG), MIDI,
   AIFF, AIFC, Real audio (RA)
 * Image: BMP, CUR, EMF, ICO, GIF, JPEG, PCX, PNG, TGA, TIFF, WMF, XCF
 * Misc: Torrent
 * Program: EXE
 * Video: ASF format (WMV video), AVI, Matroska (MKV), Quicktime (MOV),
   Ogg/Theora, Real media (RM)

It tries to extract as much information as possible. For some file formats it
gives more information than libextractor: the RIFF parser, for example, can
extract the creation date, the software used to generate the file, etc. But
hachoir-metadata cannot guess information. The most complex operation it
performs is computing the duration of a music file from the frame size and
the file size.
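
As a rough sketch of such a duration computation, assuming a constant bit rate
MPEG-1 Layer III stream in which every frame has the same size and holds 1152
samples (the function name estimate_duration and its parameters are
illustrative and are not part of hachoir-metadata)::

    from datetime import timedelta

    # Samples stored in one MPEG-1 Layer III frame
    SAMPLES_PER_FRAME = 1152

    def estimate_duration(file_size, frame_size, sample_rate):
        """Estimate the duration from the file size and the frame size.

        file_size and frame_size are in bytes, sample_rate is in Hz.
        Tags and headers are ignored, so this is only an approximation.
        """
        frame_count = file_size / frame_size
        seconds = frame_count * SAMPLES_PER_FRAME / sample_rate
        return timedelta(seconds=seconds)

    # Frame size in bytes for 128 kbit/s at 44.1 kHz (padding ignored)
    frame_size = 144 * 128000 / 44100
    print(estimate_duration(4000000, frame_size, 44100))

The result is a datetime.timedelta, so it can be formatted or compared
without parsing a string.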

hachoir-metadata has three modes:

 * classic mode: extract metadata; use --level=LEVEL to limit the quantity
   of information displayed (not the quantity extracted)
 * --type: show the file format and the most important information on one line
 * --mime: only display the file MIME type

The command 'hachoir-metadata --mime' works like 'file --mime',
and 'hachoir-metadata --type' like 'file'. But today the file command supports
more file formats than hachoir-metadata.

Website: http://hachoir.org/wiki/hachoir-metadata

Example
=======

Example on AVI video (RIFF file format)::

    $ hachoir-metadata pacte_des_gnous.avi
    Common:
    - Duration: 4 min 25 sec
    - Comment: Has audio/video index (248.9 KB)
    - MIME type: video/x-msvideo
    - Endian: Little endian
    Video stream:
    - Image width: 600
    - Image height: 480
    - Bits/pixel: 24
    - Compression: DivX v4 (fourcc:"divx")
    - Frame rate: 30.0
    Audio stream:
    - Channel: stereo
    - Sample rate: 22.1 KHz
    - Compression: MPEG Layer 3

Modes --mime and --type
=======================

The --mime option only displays the file MIME type (it works like the UNIX
"file --mime" program)::

    $ hachoir-metadata --mime logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
    logo-Kubuntu.png: image/png
    sheep_on_drugs.mp3: audio/mpeg
    wormux_32x32_16c.ico: image/x-ico

The --type option displays a short description of the file type (it works
like the UNIX "file" program)::

    $ hachoir-metadata --type logo-Kubuntu.png sheep_on_drugs.mp3 wormux_32x32_16c.ico
    logo-Kubuntu.png: PNG picture: 331x90x8 (alpha layer)
    sheep_on_drugs.mp3: MPEG v1 layer III, 128.0 Kbit/sec, 44.1 KHz, Joint stereo
    wormux_32x32_16c.ico: Microsoft Windows icon: 16x16x32

What's new in hachoir-metadata 1.0?
===================================

Version 1.0.1
-------------

 * Only use hachoir_core.profiler with --profiler command line option
   so 'profiler' Python module is now optional
 * Set shebang to "#!/usr/bin/python"

Version 1.0
-----------

 * Real audio: read number of channels, bit rate, sample rate and
   compute compression rate
 * JPEG: Read user comment
 * Windows ANI: Read frame rate
 * Use Language from hachoir_core to store language from ID3 and MKV
 * OLE2 and FLV: Extractors are now fault tolerant

What's new in hachoir-metadata 0.10.0?
======================================

hachoir-metadata is now fault tolerant (like hachoir-core and hachoir-parser)!
It is also robust against fuzzing tests.

New supported formats:

 * Microsoft Archive (.mar)
 * Microsoft Office documents: Word (.doc), Excel (.xls), Powerpoint (.ppt)
 * X11 Portable Compiled Font (.pcf)
 * New-style Executable (Windows 16-bits program)

New features:

 * Make a distinction between the raw value and the formatted value, so it's
   possible to reuse data. E.g. the raw value of the number of channels is 2
   (integer) and the formatted value is "stereo" (string)
 * Add plugin for Nautilus program (of Gnome project)
 * Add plugin for Konqueror program (of KDE project)
 * New API:

   * hasattr(meta, "key") => meta.has("key")
   * meta.key[0] => meta.get("key") or meta.getText("key")

 * "quality" argument to limit extraction complexity (quality=0 is
   the fastest extraction, quality=1.0 is the best but also the slowest)
 * Code is now fault tolerant: don't crash on error but just display error
   message and continue the extraction
 * creation_date and last_modification value type is datetime.date() or
   datetime.datetime() (and not a string)
 * duration type is datetime.timedelta(): precision is now microseconds
   and not milliseconds
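
The raw/formatted distinction and the has(), get() and getText() calls listed
above can be pictured with a toy class. This is only a sketch: ToyMetadata,
its add() method and the key names are hypothetical stand-ins and not the
hachoir-metadata implementation::

    from datetime import timedelta

    class ToyMetadata:
        """Toy stand-in storing a raw value and its formatted text."""

        def __init__(self):
            self._values = {}  # key -> (raw value, formatted text)

        def add(self, key, raw, text=None):
            # Fall back to str(raw) when no formatted text is given
            if text is None:
                text = str(raw)
            self._values[key] = (raw, text)

        def has(self, key):
            return key in self._values

        def get(self, key):
            # Raw value: reusable in computations
            return self._values[key][0]

        def getText(self, key):
            # Formatted value: meant for display
            return self._values[key][1]

    meta = ToyMetadata()
    meta.add("nb_channel", 2, "stereo")
    meta.add("duration", timedelta(minutes=4, seconds=25), "4 min 25 sec")
    if meta.has("nb_channel"):
        print(meta.get("nb_channel"), meta.getText("nb_channel"))

Keeping the raw value next to the text lets a caller compare durations or
count channels directly, while the display code keeps using the string.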

Changes:

 * track_number and track_total raw value type is an integer
 * MPEG audio: read music genre and language from ID3v2 and compute
   approximation of bit rate and duration of VBR MP3
 * JPEG: support progressive JPEG (all start of frame types)
 * JPEG (IPTC): Support creation date (keys 55 and 60)
 * Convert more strings to Unicode
 * Reject sample rate smaller than 1 kHz
 * MKV: don't read metadata tags ("SimpleTags") if quality<good
 * ASF: extract album title, track number and create date (year)
 * Archives: max number of processed files depends on quality argument
 * Compute compression rate of archive, audio and picture documents (when we
   have enough information to do it)
 * Don't compute Ogg duration with quality < 0.5 (default)
 * PNG: Fix bits/pixel computation
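
A minimal sketch of a compression rate computation, assuming the rate is
defined as the uncompressed size divided by the compressed size (this
definition and both function names are assumptions made for illustration)::

    def picture_raw_size(width, height, bits_per_pixel):
        """Size in bytes of the uncompressed pixel data."""
        return width * height * bits_per_pixel // 8

    def compression_rate(uncompressed_size, compressed_size):
        """Return the ratio, or None when the information is not usable."""
        if not uncompressed_size or not compressed_size:
            return None
        if uncompressed_size < 0 or compressed_size < 0:
            return None
        return float(uncompressed_size) / compressed_size

    raw_size = picture_raw_size(600, 480, 24)
    print(compression_rate(raw_size, 86400))

Returning None when a size is missing mirrors the "when we have enough
information" condition: the value is simply skipped instead of guessed.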

What's new in hachoir-metadata 0.9.0?
=====================================

 * Add extractors for EXE program, PSD picture, TTF font and Torrent file
 * Set limits on values to skip invalid metadata (e.g. year in 1900..2030)
 * Ogg: compute duration and support "video" chunk
 * Add keys "track_total", "version"
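
The value limits can be sketched as a simple range filter. The 1900..2030
range comes from the entry above; the function name valid_year and the
choice to return None for a rejected value are illustrative assumptions::

    MIN_YEAR = 1900
    MAX_YEAR = 2030

    def valid_year(value):
        """Return the year as an integer, or None if it looks invalid."""
        try:
            year = int(value)
        except (TypeError, ValueError):
            return None
        if MIN_YEAR <= year <= MAX_YEAR:
            return year
        return None

    for candidate in ("2006", 1850, "unknown"):
        print(candidate, valid_year(candidate))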

What's new in hachoir-metadata 0.8.2?
=====================================

New features:
 * Truncate very long string (more than 800 characters)
 * setup.py uses distutils by default (and not setuptools) and doesn't depend
   on hachoir-core nor hachoir-parser

Bugfixes:
 * AIFF: skip duration computation if rate is zero (to avoid division by zero)
 * XCF: catch KeyError on bits_per_pixel
 * JPEG (EXIF): only convert exposure to "1/%g" if value is a float
 * WAVE: rewrite extractor. Fix bit rate, fix duration computation, support
   wave with 6 channels and IEEE (32-bit float) codec
 * AVI: avoid division by zero in duration computation
 * Matroska: convert string to Unicode if needed

What's new in hachoir-metadata 0.8.1?
=====================================

 * Fix --version command line option (rename module hachoir to hachoir_core)

What's new in hachoir-metadata 0.8?
===================================

New formats:
 * Aldus Placeable Metafile (APM) picture
 * Audio Interchange File Format (AIFF)
 * Audio Interchange File Format Compressed (AIFC)
 * Microsoft Enhanced Metafile (EMF) picture
 * Microsoft Windows Metafile (WMF) picture
 * Real audio
 * Real media
 * Targa picture

Changes:
 * For string, strip spaces and then skip empty string
 * ICO: use 8 bits/pixel if bpp=0
 * JPEG: format version is "JFIF %u.%02u" (and not "JPEG %s.%s")
 * JPEG: don't compute JPEG quality if needed fields are missing
 * RIFF: compute duration of each stream since audio stream may be
   shorter than video stream


What's new in hachoir-metadata 0.7?
===================================

Important changes:
 * New extractors: Ogg/Vorbis and Ogg/Theora
 * JPEG: compute JPEG quality
 * Matroska: extract subtitle info, support multiple audio and video streams,
   read audio codec, read audio/video title and language
 * Audio: extract bits/sample for audio
 * GIF: read comments and pixel format
 * RIFF: able to use AVI header, support multiple audio streams
 * ID3: also extract creation date
 * IPTC: recognize more tags (author, country, title)
 * Add --parser-list and --profiler command line options

Small changes:
 * Strip spaces (space, tab, new line) for all strings
 * Remove duplicate strings; also, if you have "verylongtext" and "verylo",
   keep the longer value
 * Add warning when a tag is skipped (for ID3, etc.)
 * Support invalid unicode filenames
 * Fix all divisions to avoid division by zero
 * TAR and ZIP archives: only process first 5 files
 * EXIF: ignore image size if we already know the size since EXIF size is not
   updated when an image is resized
 * Bitmap: read format version
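
The string clean-up rules above (strip spaces, drop duplicates, keep the
longer of two values when one is the start of the other) can be sketched as
follows. The function name add_unique and the list-based storage are
assumptions made for illustration::

    def add_unique(values, text):
        """Add text to values unless an equal or longer variant exists."""
        text = text.strip()  # removes spaces, tabs and new lines
        if not text:
            return values    # nothing left after stripping
        for index, old in enumerate(values):
            if old.startswith(text):
                # Same string, or a shorter prefix of a known value
                return values
            if text.startswith(old):
                # New string extends a known value: keep the longer one
                values[index] = text
                return values
        values.append(text)
        return values

    comments = []
    for item in (" verylo\n", "verylongtext", "verylo", "other"):
        add_unique(comments, item)
    print(comments)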

Developer changes:
 * Use array() to simplify code
 * Convert raw string to unicode string using charset ISO-8859-1
 * Add get() method to Metadata class
 * Group name can be "name[]": replaced by name[0], name[1], ...
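
The "name[]" expansion can be pictured as a per-name counter. This is a
sketch under assumptions: the function name resolve_group_name and the
counters dictionary are hypothetical and do not come from hachoir-metadata::

    def resolve_group_name(name, counters):
        """Replace a trailing "[]" by the next free index for that name."""
        if not name.endswith("[]"):
            return name
        base = name[:-2]
        index = counters.get(base, 0)
        counters[base] = index + 1
        return "%s[%d]" % (base, index)

    counters = {}
    names = [resolve_group_name(name, counters)
             for name in ("audio[]", "video[]", "audio[]", "common")]
    print(names)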

Similar projects
================

 * Kaa: http://freevo.sourceforge.net/cgi-bin/freevo-2.0/Kaa (written in Python)
 * libextractor: http://gnunet.org/libextractor/ (written in C)

A *lot* of other libraries exist to read and/or write metadata in MP3
music files and/or EXIF photos.

