Animated video can be the most problematic type of multimedia to process, especially when converting from analog source material. The traditional techniques used for live-action video rarely work for anime. At the upper end of the quality spectrum, animated content contains extreme amounts of motion, color variation, and fine detail. Even simpler forms like cartoons bend the rules of common-sense video processing. Producing a quality encode of anime can require tedious analysis, even for the most experienced multimedia experts.
Proper Source Encoding
This guide assumes the source was captured either uncompressed or with a lossless (or nearly lossless) codec. HuffYUV or Motion JPEG (MJPEG) is recommended when capturing anime. HuffYUV operating in true YUV mode should produce superior results, as MJPEG is a lossy format and can exhibit minor artifacts. Uncompressed video is the ideal capture method, but most systems simply lack the disk space it requires.
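To put the disk space concern into numbers, the raw data rate of an uncompressed capture can be estimated from the frame size, frame rate, and bytes per pixel. The sketch below assumes a 720x480 frame at a nominal 30 fps, with 2 bytes per pixel for YUY2 and 1.5 bytes per pixel for YV12; audio and container overhead are ignored, and the function names are purely illustrative.

```python
# Rough disk-space estimate for uncompressed video capture.
# Assumptions: 720x480 frame, nominal 30 fps, 2 bytes/pixel for YUY2
# and 1.5 bytes/pixel for YV12. Audio and container overhead are ignored.

def bytes_per_second(width, height, fps, bytes_per_pixel):
    """Raw video data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps

def gib_per_hour(rate_bytes_per_second):
    """Convert a byte rate into GiB consumed per hour of capture."""
    return rate_bytes_per_second * 3600 / 2**30

yuy2_rate = bytes_per_second(720, 480, 30, 2)
yv12_rate = bytes_per_second(720, 480, 30, 1.5)

for name, rate in (("YUY2", yuy2_rate), ("YV12", yv12_rate)):
    print(f"{name}: {rate / 1e6:.1f} MB/s, {gib_per_hour(rate):.1f} GiB per hour")
```

Even at the smaller YV12 size, an hour of footage runs to tens of gigabytes, which is why a lossless codec is usually the practical choice.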
Colorspace conversions will be mentioned in various sections of this guide. The colorspace defines how the color information for each pixel is stored. RGB and YUY2 are the most common, but YV12 will be employed for the later processing steps. Colorspace conversions are possible, but each conversion further degrades the video stream through rounding and interpolation. For the daring, a lossless YV12 codec called VBLE has been developed, but the code is beta and definitely unsupported. Capturing with the VBLE codec can ensure the YV12 colorspace is used throughout the entire process, from capture to final compression, thus preserving maximum image detail.
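The loss caused by a colorspace conversion can be illustrated with a simple round trip. The sketch below converts an 8-bit RGB value to YCbCr and back, rounding to integers at each step. It assumes the full-range BT.601 (JPEG-style) coefficients; capture codecs and AVI Synth may use limited-range variants, and YV12 additionally stores chroma at reduced resolution, so treat this only as a demonstration of the rounding effect.

```python
# Round-trip RGB -> YCbCr -> RGB with 8-bit rounding at each step.
# Assumption: full-range BT.601 (JPEG-style) coefficients. Real capture
# codecs and AVI Synth may use limited-range variants instead.

def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))

def rgb_to_ycbcr(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)

def ycbcr_to_rgb(y, cb, cr):
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)

def round_trip(rgb):
    """Convert an RGB triple to YCbCr and straight back again."""
    return ycbcr_to_rgb(*rgb_to_ycbcr(*rgb))

gray = (128, 128, 128)
red = (255, 0, 0)
print(gray, "->", round_trip(gray))
print(red, "->", round_trip(red))
```

Neutral gray survives the trip, while saturated red does not come back exactly. Each additional conversion gives such errors another chance to accumulate.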
[Figure: Source Video]
The frame rate for PAL video will always be 25 frames per second. Selecting the proper frame rate can prove tedious for NTSC content. DVD backups are simple enough, as a quality conversion utility like DVDx can report the proper video frame rate. NTSC videotapes (VHS) run at 30 fps, but most films on NTSC VHS were originally 24 fps streams that have been upsampled to 30 fps using telecine techniques.
PCM or ADPCM is recommended for audio capture to ensure maximum quality retention. Direct capture to compressed codecs like MP3 is possible, but this will introduce significant quality loss due to multiple compression passes. Analog captures should opt for a 44.1 kHz sample rate, while digital sources can scale up to 48 kHz. The original sample rate should be retained throughout the entire encoding process, from source to final compression. Stereo versus mono can be matched to the source or desired format.
Processing via AVI Synth
Many people avoid AVI Synth due to its complex, command-driven interface. Sadly, no graphical application currently offers the advanced capabilities of this scripted multimedia processing application. De-interlacing, resizing, sharpening, and softening routines represent just a small portion of the various filters offered by AVI Synth.
AVI Synth is built atop a scripted command language. The source file is opened and processed via an AVS script. The AVS script opens the source file, then applies the filters. AVS files are nothing more than common text files with a ".avs" extension. For example, examine this project's simple "process.avs" script.
AVISource("c:\yuv12source.avi",true,"YV12")
The source command includes two important arguments. The "true" value tells AVI Synth to load the audio stream along with the video. The last argument identifies the desired colorspace for processing video. Colorspace conversions should be avoided, though sometimes this is impossible since not all filters work with all colorspaces. In this instance, the latest beta version of Convolution3d for AVI Synth 2.5.1 only supports YV12 processing. Try to limit colorspace conversion to a maximum of two scripted changes to avoid color blending problems. Again, using the VBLE codec can offset this concern, but remember the codec is beta quality at best.
Telecine Video
Effective processing of NTSC video can be further complicated by telecine, the technique used to produce 30 frame per second video from a 24 fps source. Most movie studios use this process when transferring film to VHS, upsampling the original film to the 30 fps NTSC standard. The Telecide command offered by the Decomb filter package provides automatic removal of telecine patterns. Decimate with a cycle of 5 then removes one frame out of every five to achieve the desired 24 frame per second stream.
AVISource("c:\yuv12source.avi",true,"YV12")
Telecide(Post=false)
Decimate(Cycle=5)
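The frame rate arithmetic behind a decimation cycle of 5 can be checked quickly. The sketch below is a plain calculation, not part of Decomb, and the function name is made up for illustration. It also shows the exact NTSC rate of 30000/1001 fps, which is what the nominal "30 fps" figure refers to in practice.

```python
from fractions import Fraction

def decimated_rate(source_fps, cycle):
    """Frame rate after dropping one frame out of every `cycle` frames."""
    return source_fps * Fraction(cycle - 1, cycle)

# Nominal figures used in this guide: 30 fps in, 24 fps out.
nominal = decimated_rate(Fraction(30), 5)

# Exact NTSC rate (30000/1001, roughly 29.97 fps).
actual = decimated_rate(Fraction(30000, 1001), 5)

print(nominal, float(actual))
```

The exact result, 24000/1001 or roughly 23.976 fps, is the value many tools report for inverse telecined NTSC material.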
It is not always easy to tell whether a film has been telecined, as some content is true 30 frames per second interlaced video. Open the source file with VirtualDubMod, then step through the film manually, examining each five frame sequence closely. The terms may sound technical, but you will quickly notice how progressive frames and interlaced frames differ when examined closely with VDubMod.
If two interlaced frames and three progressive frames are present, then the video stream requires inverse telecine to restore the film back to the 24 frame per second format. Opt for Telecide and Decimate to remove the telecine operation. Telecide can perform post processing via a de-interlacing routine, though de-interlacing is generally not required when performing inverse telecine upon a decent quality video stream. If all frames exhibit interlacing, then skip this section altogether and move on to de-interlacing the film.
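The pattern of three progressive and two interlaced frames falls directly out of how telecine (commonly called 3:2 pulldown) distributes film frames across video fields. The following sketch is a simplified model that ignores field order (top versus bottom) and simply labels each field with the film frame it came from; the function names are illustrative only.

```python
from itertools import cycle

def telecine(film_frames):
    """Apply 3:2 pulldown: film frames alternately contribute 2 and 3 fields."""
    fields = []
    for frame, repeat in zip(film_frames, cycle([2, 3])):
        fields.extend([frame] * repeat)
    # Pair consecutive fields into video frames (two fields per frame).
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]

def classify(video_frames):
    """A frame looks interlaced when its two fields come from different film frames."""
    return ["progressive" if a == b else "interlaced" for a, b in video_frames]

video = telecine(["A", "B", "C", "D"])
print(video)
print(classify(video))
```

Four film frames become five video frames, two of which mix fields from neighboring film frames. Inverse telecine reverses this mapping, which is why one frame in every five can be discarded afterwards.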
De-Interlacing
Scaled NTSC captures with a vertical height of 240 pixels or under generally do not require de-interlacing. Similarly, the threshold for PAL captures is 288 pixels. These lower resolutions are already showing the effects of blending, thus further de-interlacing offers little benefit in most cases.
The video used in this project is already 24 frames per second, thus no telecine operation needs to be performed. Direct de-interlacing can be utilized instead.
AVISource("c:\yuv12source.avi",true,"YV12")
FieldDeinterlace(full=false)
Various de-interlacing filters are available, and AVI Synth can even load VirtualDub filters if so desired. The routine offered by FieldDeinterlace works well and is recommended for most general purpose anime encodes. This filter is part of the optional Decomb filter package. Custom parameters can be defined, though the default configuration works well for a wide range of video sources. The "full" parameter is of particular interest. Set to true, de-interlacing is applied to all frames. Set to false, Decomb checks each frame for interlacing artifacts and de-interlaces only where they are found.
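The idea of selective de-interlacing can be sketched in a few lines. The code below is not Decomb's actual algorithm. It is a generic illustration in which a pixel counts as "combed" when it differs sharply, and in the same direction, from the pixels directly above and below it. The threshold and the simple vertical blend are assumptions chosen for the example, and frames are modeled as lists of rows of 8-bit luma values.

```python
def combed_pixels(frame, threshold=400):
    """Count pixels that differ sharply, in the same direction, from the rows above and below."""
    count = 0
    for y in range(1, len(frame) - 1):
        for x, value in enumerate(frame[y]):
            up = value - frame[y - 1][x]
            down = value - frame[y + 1][x]
            if up * down > threshold:
                count += 1
    return count

def deinterlace_if_needed(frame, max_combed=0):
    """Blend interior rows with their neighbours only when combing is detected."""
    if combed_pixels(frame) <= max_combed:
        return frame  # looks progressive: pass through untouched
    out = [row[:] for row in frame]
    for y in range(1, len(frame) - 1):
        out[y] = [(frame[y - 1][x] + 2 * frame[y][x] + frame[y + 1][x]) // 4
                  for x in range(len(frame[y]))]
    return out

flat = [[50] * 4 for _ in range(6)]                                   # clean frame
combed = [[0] * 4 if y % 2 == 0 else [100] * 4 for y in range(6)]     # alternating lines

print(combed_pixels(flat), combed_pixels(combed))
print(deinterlace_if_needed(combed))
```

Leaving clean frames untouched is the benefit of the selective mode: progressive frames keep their full vertical detail instead of being softened unnecessarily.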
Convolution3D
Now the actual hardcore video processing begins. Convolution3D represents a general purpose filter capable of removing a variety of imaging problems. C3D can even serve as a replacement for the myriad of commands that people often try to apply when filtering (sharpen, soften, noise removal, etc.), thus simplifying our lives. The only drawback is that C3D is extremely slow for those with older systems.
AVISource("c:\yuv12source.avi",true,"YV12")
FieldDeinterlace(full=false)
Convolution3d (preset="animeHQ")
The Convolution3D package includes several presets for streamlined operation. Custom values can be defined, but take care, as even a small change in C3D can noticeably alter a video's appearance. As indicated in the help file, three anime presets and one VHS preset are available.
- Convolution3d (preset="animeHQ") // Anime Hi Quality (good DVD source)
- Convolution3d (preset="animeLQ") // Anime Low Quality (noisy DVD source)
- Convolution3d (preset="animeBQ") // Anime Bad Quality
- Convolution3d (preset="vhsBQ") // VHS capture Bad Quality
The VHS preset is intended for analog capture sources. Anime sourced from VHS tapes generally works well with "animeBQ". The "vhsBQ" preset may prove more helpful for removing severe artifacts, though at the cost of some overall image quality.
Duplicate Frames
The Dup command is a personal choice and depends highly upon the content being processed. Dup detects duplicate frames in a video, then either copies or blends the consecutive duplicates together. Anime often includes many duplicate frames, especially older content like cartoons. In such situations, Dup can remove much of the noise while preserving quality. Be careful when selecting a threshold; a low value under 3% is recommended to avoid treating distinct frames as duplicates. Blending seems to work best for slow moving content, as it can sometimes degrade fast action.
AVISource("c:\yuv12source.avi",true,"YV12")
FieldDeinterlace(full=false)
Convolution3d (preset="animeHQ")
Dup(threshold=2, copy=true, blend=true)
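The role of the threshold can be illustrated with a generic duplicate detector. This is a sketch of the general idea only; the metric Dup actually uses may differ. Here the difference between two frames is taken as the mean absolute pixel difference expressed as a percentage of the 8-bit range, and a frame below the threshold is replaced by a copy of the previous output frame. Frames are modeled as lists of rows of luma values.

```python
def difference_percent(frame_a, frame_b):
    """Mean absolute pixel difference, as a percentage of the 8-bit range."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(frame_a, frame_b)
                for a, b in zip(row_a, row_b))
    pixels = sum(len(row) for row in frame_a)
    return 100.0 * total / (255 * pixels)

def merge_duplicates(frames, threshold=2.0):
    """Replace near-duplicates of the previous output frame with that frame."""
    output = [frames[0]]
    for frame in frames[1:]:
        if difference_percent(output[-1], frame) < threshold:
            output.append(output[-1])  # noise-only change: reuse the earlier frame
        else:
            output.append(frame)       # real change: keep the new frame
    return output

still = [[100, 100], [100, 100]]
noisy = [[101, 99], [100, 102]]   # same picture plus a little noise
moved = [[200, 100], [100, 100]]  # genuine change in the picture

result = merge_duplicates([still, noisy, moved])
print(difference_percent(still, noisy), difference_percent(still, moved))
```

With a low threshold the noisy copy is replaced while the genuinely different frame is kept. Raising the threshold too far would start swallowing real motion, which is the failure the 3% guideline is meant to prevent.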
Additional Filters
The filters described above work well with most anime content, though they should not be considered a perfect solution for all animated films. Basic operations like cropping bad pixels, dimension resizing, contrast enhancement, and brightness ramping may need to be applied. Still, a good quality source captured at the desired output resolution will generally not require these types of operations.
Resizing functions can actually work to either slightly soften or sharpen a video if required. Bilinear resizing works using a quick one step process. When downscaling, the result is often a slightly blurred image, thus helping to alleviate fine detail problems like mosquito noise. Bicubic resizing is the best choice for upscaling video from a quality source due to an improved two part process. Bicubic can also work to sharpen an image during downscaling operations. Remember to resize or crop the video so that the final width and height are both multiples of 16 to achieve maximum encoder efficiency.
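Picking dimensions that are multiples of 16 is simple arithmetic. The helper below is a small sketch with hypothetical function names that rounds a target size to the nearest multiple of 16, rounding halves upward.

```python
def nearest_multiple(value, base=16):
    """Round to the nearest multiple of `base`, rounding halves up."""
    return (value + base // 2) // base * base

def mod16_size(width, height):
    """Snap both dimensions to multiples of 16."""
    return nearest_multiple(width), nearest_multiple(height)

print(mod16_size(720, 480))  # already aligned
print(mod16_size(360, 240))  # width needs adjusting
```

Note that snapping only one dimension, as happens with 360x240, changes the aspect ratio slightly. Cropping a few pixels before resizing, or choosing the next smaller aligned width instead, are both reasonable ways to compensate.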
Video Compression
VirtualDubMod is recommended for compression with AVI Synth. Just open the .avs script file and VDubMod will take care of the interface concerns. Additional editing can also be performed, though learning the various AVI Synth commands is recommended. MPEG-4 is the codec of choice for today's videos, though each of the popular codecs seems to encode anime quite differently. DivX 5 Professional works well, but disable psychovisual enhancement. The psychovisual model can severely distort color changes and rarely helps with compression efficiency of anime content. B-frames can be enabled with little problem, though quarter pixel resolution and global motion compensation usually offer only minimal improvements for animated content.
Legalities aside, XviD is probably the best MPEG-4 choice for anime. As with the psychovisual setting in DivX, disable XviD's advanced lumi masking before encoding for maximum image quality. Quarter pixel might improve quality slightly, but the setting is usually not required when using level six motion search precision. Chroma motion and chroma optimization seem to work quite well with anime content, especially for fast action sequences. A small number of B-frames (1-4) can also be configured, but be sure to set DX50 compatibility for proper playback with the widest range of decoders currently available.
The H.263 quantization method is the best pick for most anime, as this setting slightly blurs the image, thus further removing any background noise. MPEG quantization will slightly sharpen the image if too much blurring is already present in the video stream. As usual, the modulated high quality setting is recommended for two pass encoding projects. The new VHQ mode setting should never be configured above level one for one pass anime encodes, to avoid image artifacts. Higher levels should only be utilized with two pass encoding techniques. The proper bitrate is subjective at best. Full resolution projects (720x480 NTSC) should opt for a minimum bitrate of 500 Kbps, though 1000 Kbps or more is preferred. A half resolution encode scales much better at low bitrates, oftentimes reaching down to 250 Kbps without extreme quality loss.
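One way to see why a half resolution encode copes better with low bitrates is to compare the average number of bits available per pixel. The sketch below assumes that "half resolution" means 360x240, that the video runs at 24 fps, and that 1 Kbps equals 1000 bits per second. The file size example further assumes a 24 minute episode with 128 Kbps audio, and it ignores container overhead.

```python
def bits_per_pixel(bitrate_kbps, width, height, fps):
    """Average bits available per pixel per frame."""
    return bitrate_kbps * 1000 / (width * height * fps)

def stream_bytes(video_kbps, audio_kbps, seconds):
    """Approximate file size in bytes, ignoring container overhead."""
    return (video_kbps + audio_kbps) * 1000 * seconds // 8

full = bits_per_pixel(500, 720, 480, 24)   # full resolution at 500 Kbps
half = bits_per_pixel(250, 360, 240, 24)   # assumed half resolution at 250 Kbps
size = stream_bytes(500, 128, 24 * 60)     # hypothetical 24 minute episode

print(f"full: {full:.3f} bpp, half: {half:.3f} bpp")
print(f"estimated size: {size / 2**20:.1f} MiB")
```

Under these assumptions the half resolution encode at 250 Kbps still gets twice as many bits per pixel as the full resolution encode at 500 Kbps, since it carries only a quarter of the pixels.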
[Figure: Processed Video]
In our project's example, AVI Synth will output video in the YV12 colorspace format. Most VirtualDub filters operate only with the RGB format, thus any filtering should be kept to a minimum. If no VDub filtering is desired, then opt for compression via the "fast recompress" method. This will directly output the source to the final video compression format, thus bypassing an unneeded colorspace conversion. Not all codecs properly support YV12, so the codec may actually insert a conversion into the encoding process itself. Luckily, the latest builds of XviD properly handle the YV12 format for superior imaging quality. YV12 is also the fastest encoding option for nearly all MPEG-based routines.
Audio Compression
Regardless of the codec originally used, AVI Synth will pass the audio stream as uncompressed pulse code modulated (PCM) sound to VDubMod. Remember it was recommended to capture the source with PCM or ADPCM audio? AVI Synth is the reason why, as two compression passes with a format like MPEG-1 Layer 3 (MP3) would likely distort the source audio stream. LAME MP3 at 128 Kbps stereo (64 Kbps) should suffice for all but the most demanding listener. Ogg Vorbis at similar bitrates offers a great alternative, if so desired.
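For a sense of scale, the data rate of the uncompressed PCM stream can be compared with the MP3 target. The sketch below assumes 16-bit samples at 44.1 kHz in stereo, which is a common capture setting but is not stated in this guide.

```python
def pcm_bits_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

pcm = pcm_bits_per_second(44100, 16, 2)  # assumed 16-bit stereo capture
mp3 = 128 * 1000                         # 128 Kbps target
ratio = pcm / mp3

print(f"PCM: {pcm / 1000:.1f} Kbps, reduction of about {ratio:.0f}:1")
```

A reduction of that size necessarily discards information, which is why running the audio through a lossy encoder twice is best avoided.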
Audio/Video Sync
Depending upon the source material, a slight audio to video offset may exist. The offset value can be determined by playing back the original source with your choice of multimedia players. Setting the audio and video durations to match usually takes care of most a/v sync problems.
Otherwise, YAAI can be used to determine a new frame rate value or an audio offset value. The frame rate value can be specified in the video frame rate control dialog. The audio offset value can be entered in the audio skew setting within the audio/video interleave option dialog of VDubMod.
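The two correction values can also be worked out by hand. The sketch below is not how YAAI is known to calculate them; it simply shows the underlying arithmetic with hypothetical function names. The frame rate is the one at which the video lasts exactly as long as the audio, and the skew is the time difference between a visible event and its matching sound.

```python
def matched_frame_rate(frame_count, audio_seconds):
    """Frame rate at which the video lasts exactly as long as the audio."""
    return frame_count / audio_seconds

def audio_skew_ms(video_event_seconds, audio_event_seconds):
    """Delay to apply to the audio in milliseconds (negative if the audio is late)."""
    return round((video_event_seconds - audio_event_seconds) * 1000)

# Example: 34560 frames with an audio track lasting 1441.44 seconds.
fps = matched_frame_rate(34560, 1441.44)

# Example: a door slams on screen at 12.5 s but is heard at 12.3 s.
skew = audio_skew_ms(12.5, 12.3)

print(f"{fps:.3f} fps, delay audio by {skew} ms")
```

A frame rate change suits drift that grows over the length of the file, while a fixed skew suits a constant offset.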
Container Format
VirtualDubMod supports both AVI and OGG multimedia container formats. The industry standard AVI format is general purpose, well established, playable upon a variety of devices, and editable with multitudes of software applications. The OGG format is relatively unsupported, but is definitely the superior choice for PC-only use, especially when opting for a variable bitrate encoding method, such as the VBR modes offered by the LAME MP3 encoder. Take your pick, as either will suffice for those only interested in video archiving for playback purposes with a personal computer.
Preview Mode
VirtualDubMod allows for real-time preview of an encoding project before final rendering. The preview feature is accessible from the file menu. The video compression method must be configured to full processing mode. Once launched, the preview interface allows for the simultaneous viewing of both the input source stream and the final processed video. Render performance can be slow in this mode, so be patient when performing analysis. If everything looks acceptable, cancel the preview interface, switch back to fast recompress mode, then save the project to the final output file.
A/V Sync Revisited
Yes, we are back to this subject once again. In extremely rare cases the final compressed file can suffer an audio to video synchronization problem. The fix works in the same manner as the previous section. The only difference is that direct stream copy can be selected for both the video and audio streams in VDubMod, thus there is no need to re-encode.
Quality Comparison
Naturally, a high quality source provides a high quality encode. The top is the source video, the bottom is the processed video.
[Figure: Quality Comparison]
Final Thoughts
Animated content can prove problematic to process and encode. The rules associated with live action content often do not work for anime. Worse yet, many enthusiasts distribute captured material without any form of processing applied. At minimum, a simple de-interlacing routine should be used before distribution, even if it is just the internal de-interlacer offered with DivX 5 Professional during compression. For the worst encodes, processing and recompression is not advisable due to quality loss, but a quality post processing filter like ffdshow can help to alleviate image problems associated with poor quality video streams.
To conclude, I will soon have an inbox full of emails proclaiming the techniques described in this article are incorrect or inaccurate. In the worst case, a few flames will probably arrive as well. To clarify this situation, no single technique can properly process all video streams, regardless of similarities between the content. Hundreds of pages could be devoted to AVI Synth alone, as the whole premise of this script driven utility is to be as simple or complex as the end-user feels comfortable using. Accordingly, this simplified article serves more as a general purpose guide, not as a step-by-step instructional, to processing animated video content.