Faculty of Creative Multimedia
Multimedia University, Cyberjaya, Selangor 63100
Phone: +60.16.626 0634    Fax: +60.3.8312 5554

Lecture Series 06 :: Week 05

Fundamentals of digital video

Students will explore the fundamentals of digital video on the desktop which include its characteristics, file formats, video compression, and video storage options. Students will also be exposed to video editing and compositing techniques.

The term video editing can refer to:

  • non-linear editing system, using computers with video editing software
  • linear video editing, using videotape
  • vision mixing, when working with live video signals

Video editing is the process of editing segments of motion video footage, special effects and sound recordings. Motion picture film editing is a predecessor to video editing and, in several ways, video editing simulates motion picture film editing, both in theory and in its use of non-linear and linear editing systems. Using video or film, a director can communicate non-fictional and fictional events. The goal of editing is to manipulate these events to shape how they are communicated, for better or for worse. It is a visual art.

Modern non-linear editing systems are computer-based, though there was a transitional analog period using multiple source VCRs or LaserDisc players. Footage is played back and captured to a hard drive. Content is ingested and recorded natively in the appropriate codec, which is then used by software such as Sony Vegas Pro, MAGIX Video Pro X, Avid’s Media Composer and Xpress Pro, Apple’s Final Cut Pro, and Adobe’s Premiere to manipulate the captured footage. High definition video is becoming more popular and can be readily edited using the same software along with related motion graphics programs. Clips are arranged on a timeline, music tracks and titles are added, effects can be created, and the finished program is “rendered” into a finished video. The video may then be distributed in a variety of ways including DVD, web streaming, QuickTime movies, iPod, CD-ROM, or videotape.

For the home market, consumer-friendly products such as MAGIX Movie Edit Pro, Adobe Premiere Elements, Avid Xpress DV, CyberLink PowerDirector, Final Cut Express, Sony Vegas, Pinnacle Studio, ULead VideoStudio, Roxio Easy Media Creator, and muvee autoProducer have come on the market with the emergence of computer video editing for the home PC. Two free programs that are bundled with computers are Apple’s iMovie and Microsoft’s Windows Movie Maker. Many other free and open-source video editing programs are also available.

Description of video

The term video (from Latin: “I see”) commonly refers to several storage formats for moving pictures: digital video formats, including Blu-ray Disc, DVD, QuickTime, and MPEG-4; and analog videotapes, including VHS and Betamax. Video can be recorded and transmitted in various physical media: in magnetic tape when recorded as PAL or NTSC electric signals by video cameras, or in MPEG-4 or DV digital media when recorded by digital cameras.

Quality of video essentially depends on the capturing method and storage used. Digital television (DTV) is a relatively recent format with higher quality than earlier television formats and has become a standard for television video.

3D-video, digital video in three dimensions, premiered at the end of the 20th century. Six or eight cameras with realtime depth measurement are typically used to capture 3D-video streams. The format of 3D-video is fixed in MPEG-4 Part 16 Animation Framework eXtension (AFX).

In the UK, Australia, the Netherlands, Finland, Hungary and New Zealand, the term video is often used informally to refer to both videocassette recorders and video cassettes; the meaning is normally clear from the context.

Characteristics of video streams

Number of frames per second

Frame rate, the number of still pictures per unit of time of video, ranges from six or eight frames per second (frame/s) for old mechanical cameras to 120 or more frames per second for new professional cameras. PAL (Europe, Asia, Australia, etc.) and SECAM (France, Russia, parts of Africa, etc.) standards specify 25 frame/s, while NTSC (USA, Canada, Japan, etc.) specifies 29.97 frame/s. Film is shot at the slower frame rate of 24 frame/s, which slightly complicates the process of transferring a cinematic motion picture to video. The minimum frame rate needed to achieve the illusion of a moving image (persistence of vision) is about fifteen frames per second.

Display resolution

The size of a video image is measured in pixels for digital video, or horizontal scan lines and vertical lines of resolution for analog video. In the digital domain (e.g. DVD) standard-definition television (SDTV) is specified as 720/704/640×480i60 for NTSC and 768/720×576i50 for PAL or SECAM resolution. However in the analog domain, the number of visible scanlines remains constant (486 NTSC/576 PAL) while the horizontal measurement varies with the quality of the signal: approximately 320 pixels per scanline for VCR quality, 400 pixels for TV broadcasts, and 720 pixels for DVD sources. Aspect ratio is preserved because of non-square “pixels”.

New high-definition televisions (HDTV) are capable of resolutions up to 1920×1080p60, i.e. 1920 pixels per scan line by 1080 scan lines, progressive, at 60 frames per second.

Aspect ratio

Aspect ratio describes the dimensions of video screens and video picture elements. All popular video formats are rectilinear, and so can be described by a ratio between width and height. The screen aspect ratio of a traditional television screen is 4:3, or about 1.33:1. High definition televisions use an aspect ratio of 16:9, or about 1.78:1. The aspect ratio of a full 35 mm film frame with soundtrack (also known as the Academy ratio) is 1.375:1.

Ratios where the height is taller than the width are uncommon in general everyday use, but do have application in computer systems where the screen may be better suited for a vertical layout. The most common tall aspect ratio of 3:4 is referred to as portrait mode and is created by physically rotating the display device 90 degrees from the normal position. Other tall aspect ratios such as 9:16 are technically possible but rarely used.

Pixels on computer monitors are usually square, but pixels used in digital video often have non-square aspect ratios, such as those used in the PAL and NTSC variants of the CCIR 601 digital video standard, and the corresponding anamorphic widescreen formats. Therefore, an NTSC DV image which is 720 pixels by 480 pixels is displayed with the aspect ratio of 4:3 (which is the traditional television standard) if the pixels are thin and displayed with the aspect ratio of 16:9 (which is the anamorphic widescreen format) if the pixels are fat.
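The relationship between stored pixel counts, pixel shape and displayed aspect ratio can be worked through numerically. The sketch below is illustrative only: the function name is made up for this example, and it assumes the full 720-pixel width maps onto the screen, which is a simplification of real broadcast practice.

```python
from fractions import Fraction

def pixel_aspect_ratio(width_px, height_px, display_aspect):
    """Return the pixel shape (width / height) needed so that a
    width_px x height_px raster fills a screen of the given aspect ratio."""
    storage_aspect = Fraction(width_px, height_px)
    return display_aspect / storage_aspect

# NTSC DV frame from the text: 720 x 480 stored pixels.
par_4x3 = pixel_aspect_ratio(720, 480, Fraction(4, 3))    # "thin" pixels
par_16x9 = pixel_aspect_ratio(720, 480, Fraction(16, 9))  # "fat" pixels

print("4:3 display  -> pixel aspect", par_4x3)    # 8/9, narrower than square
print("16:9 display -> pixel aspect", par_16x9)   # 32/27, wider than square
```

A value below 1 means each pixel is narrower than it is tall, and a value above 1 means it is wider, which matches the "thin" and "fat" description above.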

Non-linear editing system

Non-linear editing

Non-linear editing for films and television postproduction is a modern editing method which involves being able to access any frame in a digital video clip with the same ease as any other. This method is similar in concept to the “cut and paste” technique used in film editing from the beginning. However, the cutting of film negatives made it originally a destructive process. Non-linear, non-destructive methods began to appear with the introduction of digital video technology. It can also be viewed as the audio/video equivalent of word processing, which is why it is called desktop editing in the consumer space [1].

Video and audio data are first captured to hard disks or other digital storage devices. The data is either recorded directly to the storage device or is imported from another source. Once imported, the data can be edited on a computer using any of a wide range of software.

In non-linear editing, the original source files are not lost or modified during editing. Professional editing software records the decisions of the editor in an edit decision list (EDL) which can be interchanged with other editing tools. Many generations and variations of the original source files can exist without needing to store many different copies, allowing for very flexible editing. It also makes it easy to change cuts and undo previous decisions simply by editing the edit decision list (without having to have the actual film data duplicated). Loss of quality is also avoided due to not having to repeatedly re-encode the data when different effects are applied.
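The idea of an edit decision list can be sketched in a few lines. This is a toy model, not a real EDL file format: the Event class, the clip names and the frame labels are all hypothetical, and actual EDLs identify material by reel and timecode.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """One edit decision: play frames [start, end) of a named source clip."""
    source: str
    start: int
    end: int

def render(edl, sources):
    """Assemble the program by reading (never modifying) the source clips."""
    program = []
    for event in edl:
        program.extend(sources[event.source][event.start:event.end])
    return program

# Hypothetical source clips; each "frame" is just a label here.
sources = {
    "interview": [f"I{n}" for n in range(6)],
    "cutaway": [f"C{n}" for n in range(4)],
}

edl = [Event("interview", 0, 2), Event("cutaway", 1, 3), Event("interview", 4, 6)]
first_cut = render(edl, sources)

# Changing a cut means editing the list, not the footage.
edl_v2 = [edl[0], Event("cutaway", 0, 3), edl[2]]
second_cut = render(edl_v2, sources)

print(first_cut)
print(second_cut)
```

Both versions of the program are produced from the same untouched source clips, so keeping many variations costs only a few list entries each.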

Compared to the linear method of tape-to-tape editing, non-linear editing offers the flexibility of film editing, with random access and easy project organization. With the edit decision lists, the editor can work on low-resolution copies of the video. This makes it possible to edit both standard-definition broadcast quality and high definition broadcast quality very quickly on normal PCs which do not have the power to do the full processing of the huge full-quality high-resolution data in real-time.

The costs of editing systems have dropped such that non-linear editing tools are now within the reach of home users. Some editing software can now be accessed free as web applications; some, like Cinelerra (focused on the professional market) and Blender3D, can be downloaded free of charge; and some, like Microsoft’s Windows Movie Maker or Apple Computer’s iMovie, come included with the appropriate operating system.

A computer for non-linear editing of video will usually have a video capture card to capture analog video and/or a FireWire connection to capture digital video from a DV camera, with its video editing software. Modern web based editing systems can take video directly from a camera phone over a GPRS or 3G mobile connection, and editing can take place through a web browser interface, so strictly speaking a computer for video editing does not require any installed hardware or software beyond a web browser and an internet connection.

Various editing tasks can then be performed on the imported video before it is exported to another medium, or MPEG encoded for transfer to a DVD.

Quality

One of the primary concerns with non-linear editing has always been picture and sound quality. The need to compress and decompress video leads to some loss in quality. While improvements in compression techniques and disk storage capacity have reduced these concerns, they still exist. Most professional NLEs are able to edit uncompressed video with the appropriate hardware.

With the more recent adoption of DV formats, quality has become an issue again: DV’s compression means that manipulation of the image can introduce significant degradation. However this can be partially avoided by rendering DV footage to a non-compressed intermediary format, thereby avoiding quality loss through recompression of the modified video images. Ultimately it depends on what changes are made to the image; simple edits should show no degradation; however, effects that alter the colour, size or position of parts of the image will have a more negative effect.

Lossless data compression

Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange for better compression rates.

Lossless data compression is used in many applications. For example, it is used in the popular ZIP file format and in the Unix tool gzip. It is also often used as a component within lossy data compression technologies.

Lossless compression is used when it is important that the original and the decompressed data be identical, or when no assumption can be made on whether certain deviation is uncritical. Typical examples are executable programs and source code. Some image file formats, like PNG or GIF, use only lossless compression, while others like TIFF and MNG may use either lossless or lossy methods.
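A quick way to see the "identical after decompression" property is a round trip through Python's standard zlib module, which implements DEFLATE, the method used by gzip and ZIP. The sample text is made up for the example.

```python
import zlib

# Highly repetitive sample data, chosen so that it compresses well.
original = b"Fundamentals of digital video. " * 200

compressed = zlib.compress(original, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after decompression:", restored == original)
```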

Lossless compression techniques

Most lossless compression programs do two things in sequence: the first step generates a statistical model for the input data, and the second step uses this model to map input data to bit sequences in such a way that “probable” (e.g. frequently encountered) data will produce shorter output than “improbable” data.

The primary encoding algorithms used to produce bit sequences are Huffman coding (also used by DEFLATE) and arithmetic coding. Arithmetic coding achieves compression rates close to the best possible for a particular statistical model, which is given by the information entropy, whereas Huffman compression is simpler and faster but produces poor results for models that deal with symbol probabilities close to 1.
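The gap between Huffman coding and the entropy bound can be checked on small inputs. The following is a minimal sketch that computes only Huffman code lengths, not the bit patterns themselves; the function names and the two sample strings are chosen for illustration.

```python
import heapq
import math
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gets one bit deeper.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def entropy_bits_per_symbol(freqs):
    """Information entropy of the symbol distribution, in bits per symbol."""
    total = sum(freqs.values())
    return -sum(w / total * math.log2(w / total) for w in freqs.values())

def average_code_length(freqs):
    """Average Huffman code length, weighted by symbol frequency."""
    lengths = huffman_code_lengths(freqs)
    total = sum(freqs.values())
    return sum(freqs[s] * lengths[s] for s in freqs) / total

balanced = Counter("aaaabbcd")     # probabilities 1/2, 1/4, 1/8, 1/8
skewed = Counter("a" * 99 + "b")   # one symbol has probability 0.99

for name, freqs in (("balanced", balanced), ("skewed", skewed)):
    print(name, "huffman:", average_code_length(freqs),
          "entropy:", round(entropy_bits_per_symbol(freqs), 3))
```

For the balanced input the Huffman average equals the entropy. For the skewed input Huffman cannot spend less than one whole bit per symbol, although the entropy is well under a tenth of a bit, which is the weakness the paragraph describes.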

There are two primary ways of constructing statistical models: in a static model, the data is analyzed and a model is constructed, then this model is stored with the compressed data. This approach is simple and modular, but has the disadvantage that the model itself can be expensive to store, and also that it forces a single model to be used for all data being compressed, and so performs poorly on files containing heterogeneous data. Adaptive models dynamically update the model as the data is compressed. Both the encoder and decoder begin with a trivial model, yielding poor compression of initial data, but as they learn more about the data performance improves. Most popular types of compression used in practice now use adaptive coders.
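The way an adaptive model "learns" can be illustrated by tracking the ideal cost of each symbol, taken here as minus log2 of the probability the model currently assigns to it. This is a sketch under stated assumptions: it is a simple order-0 counting model with a made-up input, and it measures ideal code length rather than running a real coder.

```python
import math

def adaptive_cost_bits(data, alphabet):
    """Ideal code length, symbol by symbol, under an adaptive order-0 model.

    Every symbol starts with a count of 1 (the "trivial" model). After each
    symbol is coded its count is incremented, which a decoder can mirror
    because it sees the same symbols in the same order.
    """
    counts = {sym: 1 for sym in alphabet}
    total = len(alphabet)
    costs = []
    for sym in data:
        costs.append(-math.log2(counts[sym] / total))
        counts[sym] += 1
        total += 1
    return costs

# Hypothetical input: 200 symbols, 90% of them "a", over a 4-symbol alphabet.
data = "aaaaaaaaab" * 20
costs = adaptive_cost_bits(data, "abcd")

first_block = sum(costs[:10])
last_block = sum(costs[-10:])
print("first 10 symbols:", round(first_block, 2), "bits")
print("last 10 symbols: ", round(last_block, 2), "bits")
```

The first block is expensive because the model starts out assuming all four symbols are equally likely; the identical last block is much cheaper once the counts reflect the data.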

Lossless compression methods may be categorized according to the type of data they are designed to compress. While, in principle, any general-purpose lossless compression algorithm (general-purpose meaning that it can compress any bitstring) can be used on any type of data, many are unable to achieve significant compression on data that is not of the form for which they were designed. Many of the lossless compression techniques used for text also work reasonably well for indexed images.
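One way to see why methods are matched to data types is to feed a general-purpose compressor a signal it was not designed for, then repeat after a data-specific preprocessing step. The delta filter used below is not taken from the text; it is swapped in purely as an illustration, and the "sensor-like" random-walk signal is hypothetical.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical "sensor-like" signal: a slow random walk stored as bytes.
samples = [128]
for _ in range(4095):
    samples.append((samples[-1] + rng.choice((-1, 0, 1))) % 256)
raw = bytes(samples)

# Delta filter: store each sample as its difference from the previous one.
deltas = bytes([samples[0]] + [(b - a) % 256 for a, b in zip(samples, samples[1:])])

def undo_delta(data):
    """Invert the delta filter, restoring the original samples exactly."""
    out = [data[0]]
    for d in data[1:]:
        out.append((out[-1] + d) % 256)
    return bytes(out)

raw_packed = zlib.compress(raw, 9)
delta_packed = zlib.compress(deltas, 9)

print("raw signal:   ", len(raw), "->", len(raw_packed), "bytes")
print("delta-filtered:", len(deltas), "->", len(delta_packed), "bytes")
```

The filter is fully reversible, so the scheme stays lossless, yet the same compressor does noticeably better once the data has been put into a form it handles well.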

HDV

HDV is a format for recording and playback of high-definition video on a DV cassette tape.[1] The format was originally developed by JVC and was supported by Sony, Canon and Sharp.[2] The four companies formed the HDV consortium in September 2003. Conceived as an affordable high definition format, HDV quickly caught on with many professional users due to its low cost, portability and image quality acceptable for many professional productions.

Two major versions of HDV are HDV 720p and HDV 1080i. The former is used by JVC and is informally known as HDV1. The latter is preferred by Sony and Canon and is sometimes referred to as HDV2.[3] The HDV 1080i specification defines optional progressive recording modes, and in recent publications is often called HDV 1080 or 1080-line HDV as progressive 1080-line recording becomes commonplace.[4][5]

Most HDV camcorders use “small” MiniDV/DVC cassettes. Some shoulder-mount camcorders are also capable of recording onto “large” DV/DVCAM cassettes. The recording time is the same as DV Standard Play. Unlike DV, HDV does not offer Long Play speed.

HDV is backwards compatible with DV, meaning that HDV equipment can play and record DV content. On the other hand, DV devices can neither play nor record in HDV format.

HDV and HDV logo are trademarks of Sony and JVC.[6]

HDV 720p

HDV 720p closely matches the broadcast 720p progressive scan video standard in terms of scanning type, frame rate, frame size, aspect ratio and data rate. The first HDV 720p camcorders could shoot only at 24, 25 and 30 frames per second, but later models remedied this issue, providing true 50p/60p recording modes.

Presently, JVC is the only manufacturer of HDV 720p camcorders. JVC was the first to release an HDV camcorder, the handheld GR-HD1. Later JVC shifted its HDV development to shoulder-mounted cameras.

A common misconception is that JVC developed a proprietary extension to HDV called ProHD, featuring film-like 24-frame/s progressive recording mode and LPCM audio, for professional use. JVC has clarified that ProHD is not a video recording format, but “an approach for delivering affordable HD products” and a common name for “bandwidth efficient professional HD models”.

HDV 1080i

Being a major manufacturer of interlaced video equipment, Sony adapted HDV, originally conceived as a progressive-scan format by JVC, to interlaced video. Instead of using a 1920×1080 frame size with square pixels, HDV 1080i utilizes a 1440×1080 frame with a 1.33 pixel aspect ratio. Such downsampling is not unique to HDV 1080i; it is used in other high definition video recording standards like HDCAM or DVCPRO HD to reduce the amount of information to be recorded.
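The arithmetic behind the 1440×1080 frame is easy to verify. The sketch below treats the 1.33 pixel aspect ratio as exactly 4/3, which is an assumption made for the calculation.

```python
from fractions import Fraction

stored_width, stored_height = 1440, 1080
pixel_aspect = Fraction(4, 3)   # the "1.33" in the text, taken as exactly 4/3

# Each stored pixel is stretched horizontally by the pixel aspect ratio.
display_width = stored_width * pixel_aspect
display_aspect = display_width / stored_height

# Fraction of pixels saved per frame compared with a full 1920-wide raster.
saving = 1 - Fraction(stored_width, 1920)

print("displayed width:", display_width)        # 1920
print("displayed aspect ratio:", display_aspect)  # 16/9
print("pixels saved per frame:", saving)        # 1/4
```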

Interlaced video has been a useful compromise for decades due to its ability to display motion smoothly while reducing recording and transmission bandwidth. Interlaced video is still being used in acquisition and broadcast, but interlaced display devices are being phased out. Modern flat-panel television sets that utilize plasma and LCD technology are inherently progressive. All modern computer monitors use progressive scanning as well. Interlaced video must be converted to progressive before it is displayed on a progressive-scan device. The process of converting interlaced video into progressive is known as deinterlacing. Progressive-scan television sets employ built-in deinterlacing circuits to cope with interlaced broadcast signals, but computers rarely have this capability. Interlaced video often exhibits ghosting or combing artifacts when watched on a computer.

Some HDV 1080i camcorders are capable of recording progressive video within an interlaced stream, provided that the frame rate does not exceed half of the field rate. The first HDV 1080i camcorder to implement such Progressive Scanning was the Sony HVR-V1.[7] To preserve compatibility with interlaced equipment the HVR-V1 records and outputs video in interlaced form. 25-frame/s and 30-frame/s progressive video is recorded on tape using the progressive segmented frame (PsF) technique, while 24-frame/s recording employs 2-3 pulldown. The camcorder offers two variations of 24-frame/s recording: “24” and “24A”. In “24” mode the camera ensures that there are no cadence breaks for a whole tape; this mode works better for watching video directly from the camera and for adding “film look” to interlaced video. In the “24A” mode the camera starts every clip on an A frame with timecode set to an even second margin.[8][9] Several editing tools, including Sony’s own Vegas, are capable of processing 24A video as proper 24 frames/s progressive video.[10]
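The 2-3 pulldown cadence mentioned above can be sketched with frame labels in place of pictures. This is a simplified model: the function name is made up, fields carry only the label of the film frame they came from, and the first field of each pair is taken to be the top field.

```python
def pulldown_2_3(frames):
    """Spread progressive frames over interlaced frames using a 2-3 cadence.

    Each film frame contributes alternately 2 and 3 fields; consecutive
    fields are then paired as (top, bottom) to form interlaced video frames.
    """
    fields = []
    for index, frame in enumerate(frames):
        repeats = 2 if index % 2 == 0 else 3
        fields.extend([frame] * repeats)
    # Pair fields two at a time; a trailing unpaired field is dropped.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]

video = pulldown_2_3(["A", "B", "C", "D"])
print(video)

# One second of film (24 frames) fills one second of 60-field video (30 frames).
print(len(pulldown_2_3(list(range(24)))))
```

Four film frames become five video frames, two of which mix fields from different film frames. Starting every clip on an A frame, as the “24A” mode does, keeps this repeating pattern aligned with the clip boundaries.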

Prior to the HVR-V1, Sony was offering Cineframe, essentially an interlaced-to-progressive converter, to simulate film-like motion. The conversion process involved blending and discarding fields, so vertical resolution of the resulting video suffered. Motion produced in the 24-frame/s variant of Cineframe was too uneven for professional use.[11] The same or better film look effect can be achieved by converting regular interlaced video into progressive format using computer software.[12]

In 2007 Canon commoditized progressive scanning, releasing the HV20 camcorder. The version for 50 Hz market featured PF25 mode with PsF-like recording, while the version for 60 Hz market had PF24 mode, which utilized 2-3 pulldown scheme. The HV30, released in 2008, implemented additional PsF-like PF30 mode for 60 Hz markets. Output is performed via component, HDMI and FireWire in interlaced form.[13]

To achieve full vertical resolution without introducing interlace artifacts, the progressive scan video must be properly deinterlaced. 25P and 30P video must be deinterlaced with a “weave” or “no deinterlacing” algorithm, which means joining the two fields of each frame together into one progressive frame. 24P video must go through film-mode deinterlacing, also known as inverse telecine, which throws out judder frames and restores the original 24-frame/s progressive video.
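Both operations can be sketched on toy data. In this simplified model a field is a list of scan lines and, for inverse telecine, each field is tagged with the label of the film frame it came from; real tools have to detect the cadence from the pictures themselves, so the functions below are an illustration rather than a usable deinterlacer.

```python
def weave(top_field, bottom_field):
    """Interleave two fields line by line into one progressive frame."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame

def inverse_telecine(video_frames):
    """Recover the distinct source frames from 2-3 pulldown material.

    Assumes every field carries the label of its film frame, so repeated
    fields can be recognised and dropped simply by comparing labels.
    """
    film = []
    for top, bottom in video_frames:
        for label in (top, bottom):
            if not film or film[-1] != label:
                film.append(label)
    return film

# Weave: a 4-line frame rebuilt from its two 2-line fields.
progressive = weave(["line0", "line2"], ["line1", "line3"])
print(progressive)

# Inverse telecine: five interlaced frames carrying four film frames.
telecined = [("A", "A"), ("B", "B"), ("B", "C"), ("C", "D"), ("D", "D")]
restored = inverse_telecine(telecined)
print(restored)
```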

HDV 1080p

The original 1080-line HDV specification defined interlaced recording only, which is suitable for television broadcast. As users have become increasingly interested in digital cinematography and in web videos, progressive recording became a necessity. In response to this need, capability for native progressive recording has been added to the 1080i HDV specification. Progressive recording modes are optional for 1080i HDV devices, which means that not every HDV 1080i camcorder or deck is capable of recording or playing back native progressive video. Because HDV 1080i specification now includes both interlaced and progressive recording modes, in recent publications it is often called HDV 1080 or 1080-line HDV, but the official name still bears the “i” suffix.

HDV camcorders capable of native 1080-line progressive video record it at rates of 24 frame/s (actually 23.98 frame/s) and 30 frame/s (actually 29.97 frame/s) for 60 Hz markets, and at 25 frame/s rate for 50 Hz markets. Video is output as true progressive video via an i.LINK/Firewire port. Output through other ports is performed in interlaced mode to preserve compatibility with existing interlaced equipment.[14][15]
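The "actual" rates quoted above follow from the 1000/1001 slowdown conventionally applied to nominal frame rates in 60 Hz (NTSC-derived) systems. That factor is background knowledge rather than something stated in this lecture, and the helper name is made up for the example.

```python
from fractions import Fraction

NTSC_FACTOR = Fraction(1000, 1001)

def ntsc_rate(nominal_fps):
    """Exact rate for a nominal frame rate slowed by the 1000/1001 factor."""
    return nominal_fps * NTSC_FACTOR

rate_24 = ntsc_rate(24)   # 24000/1001
rate_30 = ntsc_rate(30)   # 30000/1001

print(rate_24, "=", round(float(rate_24), 2), "frame/s")
print(rate_30, "=", round(float(rate_30), 2), "frame/s")
```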

The first 1080-line HDV camcorder to offer recording in native progressive format was the Canon XL H1, introduced in 2006. It was followed by the XH-G1 and XH-A1. When shooting in progressive mode, also known as Frame mode, the camcorders generate progressive video from interlaced CCD sensors.[16] Vertical resolution of the resulting video is about 25% lower than theoretically possible because of row-pair summation, but is still higher than the resolution of a single field.[17] Video shot in Frame Mode is recorded to tape according to HDV native 1080p specifications.

In 2008 Sony released its own models capable of native progressive recording: the HVR-S270, the HVR-Z7 and the HVR-Z5. Sony claims superiority over Canon models by saying that native progressive recording has been called 24F/25F/30F in some camcorders, which actually use interlaced CCD imagers.[4] Sony stresses that the progressive-scan CMOS sensors used in its new models create true 1080p images, meaning that the signal is processed as progressive all the way from capture to encoding to recording onto tape to output.[18]

In 2009 Canon released the HV40. Its 60 Hz variant became the first consumer HDV camcorder to feature 24-frame/s native progressive recording. Like the aforementioned Sony models, the HV40 uses a progressive-scan sensor.[19]

Sony designed a Native Progressive Recording logo for devices that are capable of native progressive recording and playback. Canon has no special logo to identify cameras that can record in “F” modes, though the HV40 camcorder bears a 24p native progressive mark. Despite differences in branding, the 24F/25F/30F modes offered by Canon and the Native Progressive Recording offered by Sony are compatible.[20][21][22][23]

Other HDV devices capable of reading and recording in native progressive 1080-line format include the Sony HVR-M15AU, HVR-25AU,[24] HVR-M15AE, HVR-25AE[25] and HVR-M35 HDV videocassette recorders, and the Canon HV20/HV30 camcorders when used in tape recorder mode.

Compatibility between brands

Generally, HDV devices are capable of playing and recording in DV format, though this is not required by the HDV specification. Many HDV devices manufactured by Sony are capable of playing and recording DVCAM tapes. 1080-line devices generally are not compatible with 720-line devices, though some standalone tape decks accept both HDV formats. Devices that can play and record native 1080p video can play and record native 1080i video; however, the opposite is not always the case.

HDV camcorders are usually offered with either 50 Hz or 60 Hz scanning rate, but some models can be made switchable for “world” capability. In particular, Canon XH-A1/G1 models and third-generation Sony models such as HVR-S270, HVR-Z5 and HVR-Z7, can be upgraded.[26]


HDV compression

HDV video is compressed with an MPEG-2 encoder and is wrapped into a transport stream. HDV audio is compressed using the MPEG-1 Layer 2 compression scheme.

MPEG-2 is an established compression method used in DVD-Video and in many digital TV broadcast formats, in particular ATSC. HDV 1080i uses a recording data rate of 25 Mbit/s while HDV 720p records at 19.7 Mbit/s. In both cases the data rate is constant because the recording media — tape — is transported with constant speed. Constant data rate limits the video quality in scenes with lots of detail, rapid movement or other complex activity like flashing lights. Such scenes may exhibit visible artifacts such as blockiness or blurring, depending on the amount of movement and on the algorithm employed in the encoder.
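Because the data rate is constant, the amount of data written to tape depends only on the recording time. The sketch below assumes a hypothetical 60-minute recording and counts only the quoted video data rates, ignoring audio and any stream overhead.

```python
def megabytes_recorded(bitrate_mbit_s, minutes):
    """Data written at a constant bitrate, in megabytes (10**6 bytes)."""
    bits = bitrate_mbit_s * 1_000_000 * minutes * 60
    return bits / 8 / 1_000_000

hdv_1080i = megabytes_recorded(25, 60)     # 25 Mbit/s for one hour
hdv_720p = megabytes_recorded(19.7, 60)    # 19.7 Mbit/s for one hour

print("HDV 1080i, 60 min:", round(hdv_1080i), "MB")
print("HDV 720p,  60 min:", round(hdv_720p), "MB")
```

The totals are the same for a static shot and for a scene full of rapid movement, which is why complex scenes have to give up picture quality rather than use more bits.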

MPEG-1 Layer 2 compression used for audio reduces the audio bitrate to 384 kbit/s, compared to 1536 kbit/s for DV audio and 1411 kbit/s for audio CDs. In most cases, HDV audio is not a significant limiting factor and is considered perceptually lossless.
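The uncompressed figures quoted for DV and audio CD can be reproduced from sample rate, bit depth and channel count. The parameters used here (48 kHz and 44.1 kHz, 16-bit, two channels) are the usual values for those formats and are supplied as assumptions, since the text gives only the resulting bitrates.

```python
def pcm_kbit_s(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000

dv_audio = pcm_kbit_s(48_000, 16, 2)   # 48 kHz, 16-bit, stereo
cd_audio = pcm_kbit_s(44_100, 16, 2)   # 44.1 kHz, 16-bit, stereo
hdv_ratio = dv_audio / 384             # compression relative to 48 kHz PCM

print("DV audio:", dv_audio, "kbit/s")
print("Audio CD:", cd_audio, "kbit/s")
print("HDV audio compression ratio:", hdv_ratio)
```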

Use of HDV in broadcast television

HDV is accepted with varying restrictions for broadcast TV use. It has been used for shows like “Deadliest Catch” and “MythBusters”, and was used in the TV series “JAG” for scenes where larger HD cameras would have been impractical.

The BBC currently considers HDV to be a “non-HD” format. With advance approval, the BBC accepts HDV footage for up to 25% of HD programming content, if the contributed material meets one of the five “technical exemption” categories: Artistic interest, Historical interest, Actuality material, Early television and cinema or Home videos.[27] The BBC has adopted HDV cameras as replacement for DV camcorders to produce widescreen standard definition content.[28]

The Discovery HD Theater accepts content sourced from 1080-line HDV camcorders, but limits it to 15% of a whole program. Producers wishing to use HDV are required to submit an approved postproduction path outlining their handling of the footage in the editing process.[5] However, the main Discovery Channel’s HD simulcast has fewer or no guidelines and accepts a mix of XDCAM HD, HDV and AVCHD for the length of a program. For example, Discovery Channel aired 911: The Bronx, a six-episode reality series set in a hospital and shot with HDV cameras.[29][30]

PBS requires that for HD broadcast the camera must use three CCD imaging sensors, each with at least a 1/2-inch diagonal and a minimum resolution of 1280 x 720. PBS does not list specific encoding formats and data rates, but requires that compression artifacts “must not be obvious when viewed on an HDTV monitor”. For certain circumstances PBS allows usage of “less than full broadcast quality equipment”.[31] In particular, Art Wolfe’s TV series “Travels to the Edge” is being produced for PBS in HDV format using Canon XL-H1 camcorders.[32]

Guiding Light, the longest-running soap opera in production in television and radio history, broke away from traditional three-sided sets and pedestal-style cameras in 2008, choosing the handheld Canon XH-G1 for shooting on practical locations.[33]


This entry was posted on Tuesday, November 10th, 2009 at 2:31 pm and is filed under Lectures and Notes, Weekly Lecture Notes.
