Integrating Video And Graphics

Many DTV receivers have a limited range of capabilities when it comes to supporting video and graphics. The HAVi UI model, with its separate devices for video, background and graphics, reflects these capabilities to some extent, but there are elements that digital TV developers need to be aware of. If you're not an experienced developer of STB software, these limitations may not be obvious at first.

A set-top box is not a PC

In a typical PC platform, video and graphics are all rendered in software (even if the video is decoded in hardware, the rendering is usually handled by software). This makes life easy when it comes to scaling and positioning video - since everything is done in software, the video is just like any other content that can be scaled, clipped and repositioned. Since it's just another piece of data to be rendered, it's easily integrated into the windowing system and coexists easily with the graphics as long as the host CPU is fast enough.

In an STB, things don't work this way. Modern set-top boxes are very highly integrated pieces of electronics, and hardware specifications are driven downwards by cost pressures from network operators and consumers. The effect of this is twofold. First, an MHP or OCAP receiver will typically have its graphics processor, MPEG demultiplexer and decoder, and CPU integrated into a single chip. Second, this chip will usually have graphics capabilities that are targeted specifically at the digital TV market, and so more general-purpose tasks may be less well supported. The cost pressures mean that there is no real reason to support these general-purpose tasks unless they are directly related to the features needed by a digital TV receiver, and so STBs often have a fairly specialised graphics subsystem that supports several graphics planes, each designed for a specific task.

Graphics planes in the receiver hardware

Before we look at the limitations this model imposes on us as developers, it's worth identifying the types of planes that we typically have in the video and graphics subsystem:

  • Background plane. Displays a single solid colour behind all of the other planes.
  • Still image plane. Provides the capability to display a static image behind decoded video or other graphics. Only a limited number of image formats will be supported, typically MPEG-2 I-frames.
  • Video plane. The plane where decoded video is displayed. Some processors may have more than one video plane to support decoding multiple video streams or to support picture-in-picture.
  • Graphics plane. Used for graphics displayed by an application, or for subtitles. Some chipsets may have more than one graphics plane for different purposes (e.g. application graphics, subtitles and general OSD), and these may have different capabilities in terms of colour depth and resolution.
  • Cursor plane. Used to display a hardware cursor.

Not every receiver will have all of these planes, and some will have substantially more. Most modern DTV chipsets will have between 5 and 13 planes in total, depending on their capabilities and cost.

We have already seen that an MHP or OCAP receiver will logically group these into three different layers - the background layer for single-colour and still-image backgrounds, one or more video layers for video decoding and one or more graphics layers for graphics display. The mapping between graphics planes in the underlying hardware and MHP or OCAP graphics layers is not strictly defined, and so more than one physical graphics plane may get merged into a single MHP graphics layer. Conversely, a system that uses software MPEG decoding may map a single physical graphics plane on to both video and graphics layers in its middleware implementation.

The specialised nature of these graphics planes (as well as the cost pressures on the chip) means that there may be some limitations or dependencies in place between the different graphics planes. The first of these limitations is the difference in capabilities between different hardware platforms. Some graphics hardware will support more colours, others will only support a basic set. Even the shape of the pixels may be different: video pixels are typically not square, while computer graphics pixels typically are, which is why a 4:3 aspect ratio computer display may be 800x600 pixels, while a 4:3 TV screen will be 720x576 pixels for PAL resolutions. This means that graphics may not perfectly align with the displayed video, and this is something that developers need to be aware of.
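To make the pixel-shape point concrete, the sketch below works out how wide a single pixel is relative to its height. It is only an illustration: the PixelAspect class and its methods are hypothetical helpers, not part of any MHP or OCAP API, and it assumes (as a simplification) that the full 720x576 PAL raster exactly fills a 4:3 screen.

```java
// Sketch only: assumes the whole raster exactly fills the display area.
// PixelAspect is a hypothetical helper, not an MHP or OCAP class.
final class PixelAspect {
    private PixelAspect() {}

    // Width-to-height ratio of one pixel when a raster of the given
    // size is shown on a display with the given aspect ratio.
    static double pixelAspectRatio(int rasterWidth, int rasterHeight,
                                   int displayAspectW, int displayAspectH) {
        double displayAspect = (double) displayAspectW / displayAspectH;
        double rasterAspect = (double) rasterWidth / rasterHeight;
        return displayAspect / rasterAspect;
    }

    // Number of video pixels needed to cover the same on-screen width
    // as squareWidth square pixels.
    static int squareToVideoWidth(int squareWidth, double pixelAspectRatio) {
        return (int) Math.round(squareWidth / pixelAspectRatio);
    }
}
```

Under these assumptions an 800x600 raster gives a ratio of 1 (square pixels), while a 720x576 raster gives 16/15, so each video pixel is slightly wider than it is tall. A circle drawn 160 pixels across in a square-pixel tool would need to be about 150 pixels wide in the video raster to look round.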

The video plane may impose its own set of limitations on the developer. Since the receiver hardware is driven mainly by cost, scaling and positioning operations may be limited. Some MPEG decoders will have the limitation that the video can only be scaled to a certain size (for instance full-screen or quarter-screen resolutions), or may only be positioned in certain areas. A common restriction is that video may not be positioned off-screen, or may only be scaled to sizes that are an even number of pixels. For developers who are used to being able to place objects off-screen, working around this can be a challenge. Workarounds are possible, but they require a slightly different way of approaching a problem. We'll look more at this particular issue later in this section.
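As a rough illustration of the two restrictions mentioned above (even-sized video and no off-screen positioning), an application could pre-adjust the values it asks for. The VideoPlacement class below is a hypothetical helper written for this sketch. The real constraints are defined by the receiver, so an application should still check with the platform what it can actually do.

```java
// Sketch only: VideoPlacement is a hypothetical helper. The limits it
// applies (even sizes, fully on-screen) are examples, not guarantees
// about any particular receiver.
final class VideoPlacement {
    private VideoPlacement() {}

    // Rounds a width or height down to an even number of pixels,
    // never returning less than 2.
    static int toEven(int size) {
        return Math.max(2, size & ~1);
    }

    // Moves an origin so that a span of the given size lies entirely
    // on a screen of the given size. Assumes size <= screenSize.
    static int clampOrigin(int origin, int size, int screenSize) {
        return Math.max(0, Math.min(origin, screenSize - size));
    }
}
```

For example, a requested width of 361 pixels becomes 360, and a 360-pixel-wide video requested at x = -40 on a 720-pixel-wide screen is moved to x = 0.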

As we've already mentioned, subtitles may be another limiting factor. Some devices may limit the placement of subtitles, and the colours available to them. Others may reserve an amount of the screen for subtitles, and this cannot be used for other purposes.

Finally, the interaction between the various planes may impose further limitations. Changing settings for the graphics plane may affect the other planes, and changing settings for other planes may affect the graphics plane. A good example of this is the relationship between the video and background planes. While the MHP specification says that the background plane may contain either a solid colour or a still image (in the form of an MPEG I-frame), some graphics hardware limits this. In many cases, the hardware MPEG decoder is needed to decode the background image. However, while it's doing this, it can't decode the video, and so displaying a background image may cause some disruption to video decoding. In some cases, this disruption lasts only as long as it takes to decode the background image, while other hardware reserves the MPEG decoder for as long as the background image is displayed.

The biggest message to take from video/graphics integration in MHP is that receivers may have very different capabilities, and so an application needs to be able to handle these differences between platforms gracefully. Sometimes, the application will have no choice but to change the way it uses the display.

Video/graphics blending

MHP and OCAP support transparent and partially-transparent components overlaid on video or on the background images. These can include other images, AWT components or just about anything else.

There are two approaches that can be taken to video/graphics blending. The HAVi API provides one approach, while the MHP APIs themselves define an alternative approach that may be more flexible in some circumstances. We'll look at the MHP approach first, which is also supported in OCAP.

The first thing we have to know is that in MHP and OCAP, graphics contexts aren't standard AWT graphics contexts. In most Java implementations, the graphics context is represented by the java.awt.Graphics class. This isn't quite flexible enough for the digital TV world, however, and so the MHP specification states that all instances of java.awt.Graphics returned by the middleware must actually be instances of org.dvb.ui.DVBGraphics. OCAP inherits this requirement from MHP. The DVBGraphics class is a subclass of java.awt.Graphics, and so any operations on the graphics context will still work correctly, but the DVBGraphics class gives us some new and useful functionality. Included in this new functionality is the ability to set compositing rules that dictate how the contents of the graphics context interact with other graphics planes to produce the final image.

The org.dvb.ui.DVBAlphaComposite class lets us define these rules for a graphics context (and provides a list of constants that let us refer to these rules). For the moment, we won't go into the rules themselves - we only need to know that several different rules may be available to us. We'll take a closer look at the rules themselves later in this tutorial.

Instances of this class are obtained from its getInstance() factory method, which takes two parameters - an integer (to define the compositing rule) and a floating-point value (to define the alpha value). While advanced systems may support a range of alpha values for blending, more basic systems may only support alpha values of 0, 1 and approximately 0.3 (i.e. fully transparent, fully opaque and approximately 30% transparent - for more information about transparency support in MHP and OCAP, see the graphics tutorial). A set of constants representing the compositing rules is also defined in this class, and we will see these in more detail later in this tutorial.

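// This fragment needs the following imports
import org.dvb.ui.DVBAlphaComposite;
import org.dvb.ui.DVBGraphics;
import org.dvb.ui.UnsupportedDrawingOperationException;

// Get our DVBAlphaComposite object. Change the values as
// you need to. Note that using this rule can be VERY
// SLOW.
DVBAlphaComposite compositeRule;
compositeRule = DVBAlphaComposite.getInstance(
  DVBAlphaComposite.SRC_OVER, 0.5f);

// Get the DVBGraphics object for the component we want
// to blend
DVBGraphics myGraphics;
myGraphics = (org.dvb.ui.DVBGraphics)
  myComponent.getGraphics();

// Now set the compositing rule, and probably watch our
// system slow to a crawl. setDVBComposite() declares a
// checked exception, so the call has to be wrapped.
try {
  myGraphics.setDVBComposite(compositeRule);
}
catch (UnsupportedDrawingOperationException ex) {
  // The receiver does not support this rule
}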

Not all compositing rules described in the specification may be available, depending on the hardware. By calling getAvailableCompositeRules() on an org.dvb.ui.DVBGraphics object, the application can get an array of integers representing the supported compositing rules. These compositing rules and their effects are described later in this tutorial.
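One way an application might use that array is to search it for the rule it would like, and fall back to another rule if the preferred one is missing. The sketch below is plain Java with no dependency on the MHP classes. CompositeRuleSupport and its method names are hypothetical, and the array is passed in as a parameter rather than fetched from a graphics context.

```java
// Sketch only: CompositeRuleSupport is a hypothetical helper. On a
// receiver, availableRules would be the array returned by
// getAvailableCompositeRules() and the rule values would be
// DVBAlphaComposite constants.
final class CompositeRuleSupport {
    private CompositeRuleSupport() {}

    // Returns true if the given rule appears in the array of
    // available rules.
    static boolean isSupported(int[] availableRules, int rule) {
        if (availableRules == null) {
            return false;
        }
        for (int i = 0; i < availableRules.length; i++) {
            if (availableRules[i] == rule) {
                return true;
            }
        }
        return false;
    }

    // Returns the first rule from preferredRules that is available,
    // or fallbackRule if none of them are.
    static int chooseRule(int[] availableRules, int[] preferredRules,
                          int fallbackRule) {
        for (int i = 0; i < preferredRules.length; i++) {
            if (isSupported(availableRules, preferredRules[i])) {
                return preferredRules[i];
            }
        }
        return fallbackRule;
    }
}
```

The loops are written in the older indexed style deliberately, on the assumption that application code may have to run on the early Java profiles used by many receivers.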

The code below will draw a partially-transparent image (stored in the variable myImage) into the graphics context stored in myGraphicsContext:

// Create a compositing rule with an alpha value of 50%.
// This uses the SRC_OVER blending rule that we will see
// later in the tutorial
DVBAlphaComposite compositingRule;
compositingRule = DVBAlphaComposite.getInstance(
    DVBAlphaComposite.SRC_OVER,
    0.5f);

// Convert the graphics context that we've got (e.g. from
// a call to java.awt.Component.getGraphics()) into a
// DVBGraphics object.  It will actually already be a
// DVBGraphics object, but the cast gives us access to
// the DVB-specific methods.
org.dvb.ui.DVBGraphics myGraphics;
myGraphics = (DVBGraphics)myGraphicsContext;

// This can throw an exception if the blending mode is
// not supported.  In this case, we're using a blending
// mode that will always be supported, but we still have
// to try to catch the exception.
try {
  myGraphics.setDVBComposite(compositingRule);
}
catch (UnsupportedDrawingOperationException ex)
{
  // Handle the exception in the way you want to
}

// Draw the image in the top left corner of the graphics
// context. We will assume that the current class is the
// observer for this operation
myGraphics.drawImage(myImage,0,0,this);

Some image formats such as PNG and GIF will support transparency as part of the format. This is supported in MHP and OCAP receivers, but may be limited by the capabilities of the underlying hardware.

HAVi Mattes

As we said earlier, if your application is using the HAVi UI API, then another option is available. The org.havi.ui.HMatte interface and the classes that implement it provide a way for applications to perform alpha-blending operations on HAVi components. There are two main differences between the HAVi approach and the DVB approach. First, the DVB approach works on the graphics context level, whereas the HAVi approach works on the component (or more accurately, HComponent) level. Second, the HAVi approach does not use the Porter-Duff compositing rules.

The HAVi UI API defines several implementations of HMatte, each with a different effect. These effects range from applying a simple alpha value to the whole component, to applying alpha values based on an image map, to applying alpha values that vary over time based on a set of image maps. Receivers may support the following matte types:

Class               Effect
HMatte              None (base interface)
HFlatMatte          Constant alpha level across the entire component
HImageMatte         Alpha level varies in location, based on an image map
HFlatEffectMatte    Alpha level varies in time, either based on a specified delay or under application control
HImageEffectMatte   Alpha level varies in time (based on a specified delay or under application control) and also in location (based on an image map)

For those mattes where the effect can vary over time, there are a number of ways of changing the effect. The effect is treated as an animation, with an array of values representing the different frames. For an HFlatEffectMatte, these values are floating point values, while for an HImageEffectMatte they are a set of java.awt.Image objects. The animation may either be controlled by the middleware (with a specified delay between 'frames' and with the application choosing to start and stop the animation), or it can be controlled directly by the application using the setPosition() method on the matte to go to the appropriate 'frame'. The only thing that you need to remember as an application writer is that while the data for the animations can be set at any time, doing so will reset the animation to the beginning of its cycle.

The setMatte() and getMatte() methods on the org.havi.ui.HComponent class allow the application to set and get the matte for that component. Once the matte has been created and set, no more action is needed by the application.

Although HAVi mattes don't explicitly use the Porter-Duff rules, their effect is typically the same as compositing with the SRC_OVER rule. This does lead to the potential for a fairly large performance impact when using these APIs.

Blending rules in MHP and OCAP

MHP receivers can support a selection of methods for video/graphics blending and compositing, based on the Porter-Duff compositing rules. As with so many of the graphics features in MHP and OCAP, the functionality that's available to application authors depends on the hardware capabilities of the receiver. The receiver has to support a minimum of three of the compositing rules, and may support up to eight.

In the following examples, the white rectangle is first rendered into an offscreen buffer. The red circle is then composited into the same buffer using the appropriate compositing rule. The combined image is then composited with the video using the SRC_OVER rule.

Rule       Mandatory   Effect                              Constant
CLEAR      Yes         [figure: CLEAR blending rule]       DVBAlphaComposite.CLEAR
DST_IN     No          [figure: DST_IN blending rule]      DVBAlphaComposite.DST_IN
DST_OUT    No          [figure: DST_OUT blending rule]     DVBAlphaComposite.DST_OUT
DST_OVER   No          [figure: DST_OVER blending rule]    DVBAlphaComposite.DST_OVER
SRC        Yes         [figure: SRC blending rule]         DVBAlphaComposite.SRC
SRC_IN     No          [figure: SRC_IN blending rule]      DVBAlphaComposite.SRC_IN
SRC_OUT    No          [figure: SRC_OUT blending rule]     DVBAlphaComposite.SRC_OUT
SRC_OVER   Yes         [figure: SRC_OVER blending rule]    DVBAlphaComposite.SRC_OVER

As you can see from the diagram, each of these rules affects the final result in a different way. Because of this, the CPU load required by these operations is different - some may be quick, while others can be very slow. One of the most potentially useful rules, SRC_OVER, is extremely slow and so developers should take care when using this rule. The relative performance of the various compositing methods is shown below, from fastest to slowest:

  • CLEAR and SRC
  • SRC_IN and DST_IN
  • SRC_OUT and DST_OUT
  • SRC_OVER and DST_OVER

While this is not guaranteed to be completely accurate, it gives you an idea of what operations should be used sparingly.

Some of these rules appear to have fairly similar effects (in particular DST_IN and DST_OUT), but there are differences. Whether these differences are important to you is really something that only you as a developer or graphic designer can know.
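For readers who want to see how the rules differ numerically, the sketch below applies the standard Porter-Duff equations to a single pixel. It is an illustration of the maths only, not of how any receiver implements blending. The PorterDuff class is hypothetical, its rule constants are local stand-ins rather than the DVBAlphaComposite values, and pixels are assumed to be stored as {alpha, premultiplied colour} with both values between 0 and 1.

```java
// Sketch only: PorterDuff is a hypothetical helper. Its constants are
// local stand-ins and are NOT the DVBAlphaComposite constant values.
final class PorterDuff {
    static final int CLEAR = 0;
    static final int SRC = 1;
    static final int SRC_OVER = 2;
    static final int DST_OVER = 3;
    static final int SRC_IN = 4;
    static final int DST_IN = 5;
    static final int SRC_OUT = 6;
    static final int DST_OUT = 7;

    private PorterDuff() {}

    // Composites one source pixel with one destination pixel.
    // Each pixel is {alpha, premultiplied colour}, both in 0..1.
    static double[] composite(int rule, double[] src, double[] dst) {
        double as = src[0];
        double ad = dst[0];
        double fs; // fraction of the source that contributes
        double fd; // fraction of the destination that contributes
        switch (rule) {
            case CLEAR:    fs = 0;      fd = 0;      break;
            case SRC:      fs = 1;      fd = 0;      break;
            case SRC_OVER: fs = 1;      fd = 1 - as; break;
            case DST_OVER: fs = 1 - ad; fd = 1;      break;
            case SRC_IN:   fs = ad;     fd = 0;      break;
            case DST_IN:   fs = 0;      fd = as;     break;
            case SRC_OUT:  fs = 1 - ad; fd = 0;      break;
            case DST_OUT:  fs = 0;      fd = 1 - as; break;
            default:
                throw new IllegalArgumentException("Unknown rule: " + rule);
        }
        return new double[] { as * fs + ad * fd, src[1] * fs + dst[1] * fd };
    }
}
```

With a source alpha of 0.25 over an opaque destination, DST_IN keeps a quarter of the destination (the part covered by the source) while DST_OUT keeps three quarters of it (the part the source does not cover). The rules that need only one input and no per-pixel multiply (CLEAR and SRC) are also the ones listed as fastest above.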

Overscan and the safe area

A final thing to remember for application developers used to PC development is the phenomenon of overscan. The picture on a TV may be slightly larger than the viewable area of the screen, in order to avoid artifacts from the data signals carried in the first few lines of the TV picture or other elements specific to TV signals. What this means for developers is that any graphics displayed at the very edge of the picture may not be visible on screen - typically, up to 5% of the image can be lost from each border. The area inside these borders is known as the safe area, and any graphics within this area are guaranteed to appear in the viewable picture.
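As a worked example of the arithmetic, the sketch below removes a percentage margin from each border of the picture. The SafeArea class is a hypothetical helper, and the 5% figure is simply the typical value quoted above, not a number queried from the receiver.

```java
// Sketch only: SafeArea is a hypothetical helper, and the margin is an
// assumed percentage rather than a value reported by the platform.
final class SafeArea {
    private SafeArea() {}

    // Returns {x, y, width, height} of the region left after removing
    // marginPercent of the picture from every border. Margins are
    // rounded up so that the result errs on the safe side.
    static int[] compute(int screenWidth, int screenHeight,
                         int marginPercent) {
        int marginX = (screenWidth * marginPercent + 99) / 100;
        int marginY = (screenHeight * marginPercent + 99) / 100;
        return new int[] {
            marginX,
            marginY,
            screenWidth - 2 * marginX,
            screenHeight - 2 * marginY
        };
    }
}
```

For a 720x576 picture with a 5% margin this gives a safe area of 648x518 pixels starting at (36, 29). For a 720x480 picture it gives 648x432 pixels starting at (36, 24).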

The size of this safe area will change between NTSC and PAL television systems. Since there is a 100-line difference in the vertical resolution between the two systems (625 lines for PAL vs. 525 for NTSC), developers must decide whether to target PAL and NTSC systems separately, or to design their graphics so that they display equally well on both systems. Safe areas are discussed in more detail in the tutorial on designing user interfaces for a TV screen.

Aspect ratios

As if these factors weren't already enough to worry about, there's also the issue of aspect ratios to consider. While most TVs today are still 4:3 aspect ratio, this is changing and 16:9 aspect ratio TVs are becoming more common. This poses a problem for developers, since changing the aspect ratio of graphics can have an ugly effect on the content. There's no really nice way around this problem - all an application can do is to be aware of when the STB changes the aspect ratio of its output (by listening to changes in the Decoder Format Conversion) and react accordingly by changing assets to meet the new aspect ratio. This is an extreme approach, and most applications will simply use graphical content that has been authored so that it doesn't look too terrible if it's viewed in the wrong aspect ratio.

Practicalities - taking steps to avoid these problems

Some of these issues can be solved by the comparatively simple step of taking care when you set up your graphical environment. If your application cares about its appearance, the most important thing is not to assume anything when setting up your graphics environment. Applications should specify every option that they need to when configuring HAVi graphics or video devices, but they should not specify constraints on parameters that they do not care about in order to avoid causing problems for other applications. Applications should also take care to check that the platform can match the configuration that they specify. If it can't, at least the application will know about the elements of the graphics and/or video configuration that will not be met.

Avoiding video positioning problems

As we've already seen, some hardware platforms have problems positioning video off the screen. One possible workaround for this is to position the video so that the displayed video area is not off the screen, but then to clip the source video to only display the desired section. For instance, the following code works on a decoder where the top left corner of the video may not be off-screen:

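// This code assumes that we have a Player already and
// that we've got the AWTVideoSizeControl from that
// Player.  This control is stored in the variable
// myAWTVideoSizeControl. It needs java.awt.Dimension,
// java.awt.Rectangle, javax.tv.media.AWTVideoSize and
// javax.tv.media.AWTVideoSizeControl to be imported.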

// Get the size of the source video
Dimension sourceVideoSize =
  myAWTVideoSizeControl.getSourceVideoSize();

// Set the source rectangle to be a quarter of the video,
// based on the source video size.  Since we're taking
// the bottom right quarter of the video, the x/y
// co-ordinates are the same as the width/height
Rectangle sourceRect = new Rectangle (
        (sourceVideoSize.width/2),
        (sourceVideoSize.height/2),
        (sourceVideoSize.width/2),
        (sourceVideoSize.height/2));

// Set the destination rectangle to be the top left
// quarter of the screen. Since we're clipping the video
// to quarter-screen, this will only reposition the
// video, not scale it (assuming the video is PAL
// resolution, i.e. 720x576). Note that we position the
// video so that the top left corner is still on-screen
Rectangle destinationRect =
  new Rectangle(0, 0, 360, 288);

// Create the transform that we will use
AWTVideoSize newVS = new AWTVideoSize(
  sourceRect, destinationRect);

// Check we can support this...
AWTVideoSize resultingVS =
  myAWTVideoSizeControl.checkSize(newVS);

// Now set the size using the result we get from
// checkSize(). setSize() returns true if the new size
// was applied, so we pass that result on to our caller
return myAWTVideoSizeControl.setSize(resultingVS);

While this requires a slightly different approach to the problem, it successfully avoids the hardware limitations of the platform while still enabling a (relatively) simple solution.
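The same idea can be generalised: given the place where the application would like the whole video to appear (possibly with its top left corner off-screen), work out which part of the source to clip and where to put it. The sketch below does only the arithmetic, in plain Java. OffscreenVideo is a hypothetical helper, it handles only the top and left edges, and its results would still need to be turned into rectangles and checked against the receiver in the way shown above.

```java
// Sketch only: OffscreenVideo is a hypothetical helper that handles
// video hanging off the top and/or left edges of the screen.
final class OffscreenVideo {
    private OffscreenVideo() {}

    // desired is {x, y, width, height}: where the WHOLE source video
    // would ideally be displayed. x and y may be negative.
    // Returns {srcX, srcY, srcW, srcH, dstX, dstY, dstW, dstH}: the
    // part of the source to clip and the on-screen area to show it in.
    static int[] clipToScreen(int sourceWidth, int sourceHeight,
                              int[] desired) {
        int dx = desired[0];
        int dy = desired[1];
        int dw = desired[2];
        int dh = desired[3];

        // Destination pixels that would fall off the left/top edge
        int hiddenX = Math.max(0, -dx);
        int hiddenY = Math.max(0, -dy);

        // Convert those to source pixels using the scaling factor
        int srcX = hiddenX * sourceWidth / dw;
        int srcY = hiddenY * sourceHeight / dh;

        return new int[] {
            srcX, srcY, sourceWidth - srcX, sourceHeight - srcY,
            dx + hiddenX, dy + hiddenY, dw - hiddenX, dh - hiddenY
        };
    }
}
```

For a 720x576 source that should appear unscaled at (-360, -288), this returns the bottom right quarter of the source (starting at 360, 288 and measuring 360x288) displayed at (0, 0) on the screen.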

Using HAVi effectively

The HAVi UI API adds a lot of extra support for the graphics model in MHP and OCAP receivers - support that's not present in standard AWT. This support allows the application to interrogate the receiver, and to find out whether the receiver can support the features desired by the application or at least to find out how close it can come. Of course, this is only useful if the application takes advantage of it, and so the application has to be flexible enough to support the different capabilities of the receivers in the marketplace.

While it can be a pain to add all of the extra checks to make sure that your application is displaying properly, this is really only a minor issue since these are largely the same for every application that you will develop. A larger task is to make sure that your application requests the features it actually needs, and no more. This is harder, since this may change for every application that you write, and it may even change through the development of the application. If you do this, however, other applications may be able to change the display configuration in a way that does not affect your application, but which may be important to them.

Careful graphics design

It's already necessary to take care about the graphical and interaction design of your application when authoring for a TV. Interlacing and other factors associated with TV displays mean that designers already have to take more care when developing graphics for TV-based applications. Adding interactivity on an open platform means that issues such as aspect ratios have to be considered as well. Other steps include designing graphics that can degrade gracefully, e.g. by not relying on images in the background plane to display important information. The tutorial on designing user interfaces for a TV screen covers these issues in a little more detail.