ISO/IEC 14496 (MPEG-4) specifies a system for the communication of interactive audio-visual scenes. The specification includes the following elements:
1. The coded representation of natural or synthetic, two-dimensional (2D) or three-dimensional (3D) objects that can be manifested audibly and/or visually (audio-visual objects).
2. The coded representation of the spatio-temporal positioning of audio-visual objects as well as their behavior in response to interaction.
4. The coded representation of information related to the management of data streams.
4. A generic interface to the data stream delivery layer functionality.
5. An application engine for programmatic control of the player, covering the format and delivery of downloadable Java byte code as well as its execution lifecycle and behavior.
6. A file format to contain the media information of an ISO/IEC 14496 presentation in a flexible, extensible format to facilitate interchange, management, editing, and presentation of the media.

The overall operation of a system communicating audio-visual scenes can be summarized as follows. At the sending terminal, the audio-visual scene information is compressed, supplemented with synchronization information, and passed to a delivery layer that multiplexes it into one or more coded binary streams, which are then transmitted or stored. At the receiving terminal, these streams are demultiplexed and decompressed. The audio-visual objects are composed according to the scene description and synchronization information and presented to the end user, who may have the option to interact with the presentation. Such interaction information is processed locally or transmitted back to the sending terminal. ISO/IEC 14496 defines the syntax and semantics of the bitstreams that convey such scene information, as well as the details of their decoding processes.
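
The send and receive path described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the packet header layout, the function names (`packetize`, `multiplex`, `demultiplex`, `compose`), and the use of zlib as the "compression" step are all hypothetical and chosen for illustration. None of it reflects the actual ISO/IEC 14496 bitstream syntax or decoding processes.

```python
import struct
import zlib
from collections import defaultdict

# Hypothetical packet header: stream id (uint16), timestamp in ms (uint32),
# payload length (uint32). Illustration only, not the ISO/IEC 14496 syntax.
HEADER = struct.Struct(">HII")


def packetize(stream_id, units):
    """Compress each (timestamp_ms, payload) unit and attach timing info."""
    return [(ts, stream_id, zlib.compress(payload)) for ts, payload in units]


def multiplex(*streams):
    """Interleave packets from all streams by timestamp into one byte string."""
    packets = sorted(p for stream in streams for p in stream)
    return b"".join(
        HEADER.pack(stream_id, ts, len(data)) + data
        for ts, stream_id, data in packets
    )


def demultiplex(blob):
    """Split the byte string back into per-stream lists of decompressed units."""
    streams = defaultdict(list)
    offset = 0
    while offset < len(blob):
        stream_id, ts, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        payload = zlib.decompress(blob[offset:offset + length])
        streams[stream_id].append((ts, payload))
        offset += length
    return dict(streams)


def compose(streams):
    """Merge all units into one presentation timeline ordered by timestamp."""
    return sorted(
        (ts, stream_id, payload)
        for stream_id, units in streams.items()
        for ts, payload in units
    )


# Sending terminal: two elementary streams with made-up payloads.
audio = packetize(1, [(0, b"audio-0"), (40, b"audio-1")])
video = packetize(2, [(0, b"video-0"), (33, b"video-1")])
blob = multiplex(audio, video)

# Receiving terminal: recover the streams and order them for presentation.
timeline = compose(demultiplex(blob))
```

In this sketch the timestamps stand in for the synchronization information, and `compose` stands in for scene composition. A real terminal would also decode a scene description that states where and how each object is presented, which the toy model omits.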