A guide to Alexa Video Skills and why they are the future of TV and Voice experiences

FX Digital's guide to Alexa Video Skills takes a look at how Amazon could shape the future of our viewing and voice experience...
by Ramsey Marwan, 5th March 2020

What is a Video Skill?

Amazon’s strong backing of its voice technology, Alexa, has shown no signs of abating in recent times. Through the creation of the Video Skill API and its release to the wider development community, Amazon have opened up vast opportunities for developers to create experiences that “enable the far-field control of video devices and streaming services using an Alexa device”. Video Skills allow Smart Home users to simply ask for the content they love with their voice, instead of having to pick up the TV remote or console controller and search and scroll endlessly through tiles of TV shows and movies.

What is Video Skills Kit and why is it giving Amazon a big advantage in the Voice Assistant market?

Video Skills Kit (VSK) is a set of Alexa-specific schemas and APIs that let developers build Video Skills for Alexa-enabled devices. This is a massive advantage to Amazon in their ongoing battle with Google for the voice market. Google do not currently offer anything akin to Video Skills, and because Amazon released this functionality first, Google may find themselves at a disadvantage as the industry progresses. Google have to rely on brands uploading video content to YouTube to make it accessible and playable through voice. Amazon, on the other hand, now give media firms access to their ecosystem through APIs or by applying video schema tags. This means that brands do not need to upload their content elsewhere for it to be discoverable in the Amazon ecosystem.

What is the difference between a Video Skill and a Custom Skill for Video?

By using the Video Skill API, Alexa can determine which devices and services the user has and which titles are available for them to watch. As any Alexa user will know, a custom skill has to be invoked on a specific device, and the user can only interact with Alexa on that device. With Video Skills, however, an individual Echo Dot can be used to control a Fire TV Stick or set-top box in the same room.

For example, a user could search for content by saying “Alexa, play Guardians of the Galaxy” and Alexa would surface all the Video Skills that offer the Guardians of the Galaxy movie. They can then choose whether they would like to purchase the movie from the Prime Video Skill or play it for free from the ‘FXDigitalMovies’ skill.

Unlike third-party custom skills, a Video Skill does not have to be enabled by the user before it can appear in the search results. This makes discoverability less of an issue for Video Skills. Users also don’t have to search for content by title. Instead, they can ask to play a certain genre or content that stars a particular actor. Alexa will then deliver the most relevant results, making searching and discovering content easier for the user.
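To make the idea of searching by title, genre or actor more concrete, here is a minimal sketch of how a skill's back end might filter a catalogue against the entities in a search request. It is an illustration only: the `Title` class, the `CATALOGUE` list, the `matches` and `search` functions and the simplified entity shape (`type` and `value`) are all assumptions made for this example, not the actual Video Skill API payload or FX Digital's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Title:
    """One catalogue entry (hypothetical shape, for illustration only)."""
    name: str
    genres: set = field(default_factory=set)
    actors: set = field(default_factory=set)


# Hypothetical in-memory catalogue; a real skill would query the provider's back end.
CATALOGUE = [
    Title("Guardians of the Galaxy", {"action", "comedy"}, {"chris pratt", "zoe saldana"}),
    Title("Jurassic World", {"action"}, {"chris pratt"}),
    Title("Avatar", {"science fiction"}, {"zoe saldana"}),
]


def matches(title, entity):
    """Return True if a single search entity matches the given title."""
    kind = entity["type"]
    value = entity["value"].lower()
    if kind == "Video":
        return title.name.lower() == value
    if kind == "Genre":
        return value in title.genres
    if kind == "Actor":
        return value in title.actors
    return False  # unknown entity types never match


def search(entities):
    """Return the names of titles that satisfy every entity in the request."""
    return [t.name for t in CATALOGUE if all(matches(t, e) for e in entities)]
```

With this sketch, a request carrying a genre and an actor narrows the results to titles that match both, which is the behaviour described above for voice searches that do not name a specific title.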

The challenges of building a Video Skill

Building a Video Skill can prove more challenging than building a Voice Skill because of the certification requirements. Like regular skills, Video Skills must be submitted to Amazon, where they undergo rigorous automated and in-person testing. If their performance is deemed inadequate, the Video Skill will not be approved and therefore will not be available on Alexa.

Video Skills are more susceptible to performance issues due to the sheer complexity of building them. For example, loading the homepage of a Video Skill calls an API three times in sequence, so a slow-performing back end can drastically slow down the Video Skill. FX Digital’s expert team of developers have found solutions to issues like these that ensure our Video Skills run at optimum performance and gain certification.
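As a rough illustration of the kind of mitigation that can help here, the sketch below issues the three homepage requests concurrently and caches the responses for a short time. This is one common approach, not a description of FX Digital's actual solution, and it assumes the three calls are independent of one another. The section names, the `fetch_section` stand-in, the `TTLCache` class and the 60-second lifetime are all hypothetical.

```python
import asyncio
import time

CALLS = {"count": 0}  # counts simulated back-end requests


async def fetch_section(name):
    """Stand-in for one slow back-end request (hypothetical)."""
    CALLS["count"] += 1
    await asyncio.sleep(0.05)  # simulated network latency
    return {"section": name, "items": [f"{name}-1", f"{name}-2"]}


class TTLCache:
    """Tiny time-based cache: entries expire after `ttl` seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._store = {}

    def get(self, key):
        hit = self._store.get(key)
        if hit is None:
            return None
        expires_at, value = hit
        if expires_at <= time.monotonic():
            del self._store[key]  # drop the stale entry
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)


cache = TTLCache(ttl=60.0)


async def cached_section(name):
    """Return a section from the cache, fetching it only on a miss."""
    cached = cache.get(name)
    if cached is not None:
        return cached
    value = await fetch_section(name)
    cache.set(name, value)
    return value


async def load_homepage():
    """Request all three sections at once instead of one after another."""
    sections = ("featured", "recent", "recommended")
    results = await asyncio.gather(*(cached_section(s) for s in sections))
    return list(results)
```

Calling `asyncio.run(load_homepage())` a second time within the cache lifetime returns the same data without touching the simulated back end again. If the real calls depend on each other's results, they cannot be run concurrently and only the caching part of this sketch would apply.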

What are the benefits of Video Skills?

Video Skills enable users to control their viewing experience with their voice. For example, when a user asks Alexa for a movie or TV show by name, the Alexa service will find the most appropriate Video Skill that offers that content and surface it to the user. Furthermore, Video Skills enable voice control for a variety of features, such as play/pause, next episode, volume control and turning a device on/off.
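The features listed above reach a skill's back end as directives, each identified by a namespace and a name in its header. The following is a minimal sketch of routing such directives to player actions. The action strings, the `HANDLERS` table and the `route` function are hypothetical, the directive is reduced to the fields the router needs, and the response event a real skill would have to send back is omitted.

```python
# Maps (namespace, name) pairs to a description of the player action.
# The action strings are placeholders for calls into a real video player.
HANDLERS = {
    ("Alexa.PlaybackController", "Play"): "resume playback",
    ("Alexa.PlaybackController", "Pause"): "pause playback",
    ("Alexa.PlaybackController", "Next"): "skip to next episode",
    ("Alexa.Speaker", "AdjustVolume"): "change volume",
    ("Alexa.PowerController", "TurnOn"): "power on",
    ("Alexa.PowerController", "TurnOff"): "power off",
}


def route(request):
    """Look up the action for an incoming directive, or report it as unsupported."""
    header = request["directive"]["header"]
    key = (header["namespace"], header["name"])
    return HANDLERS.get(key, "unsupported")
```

A table-driven router like this keeps each supported voice command visible in one place, which makes it easier to see what is covered as new features are added.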

It is also important to bear in mind that this technology is still in its infancy. Amazon are constantly introducing new features for Video Skills. For example, a new ‘Live’ tab on the Echo Show should soon be able to display TV channels that are broadcasting right now. As Amazon continue to enrich the features of Video Skills, content providers will be able to offer an improved user experience.

If a provider already has a Video Skill, FX Digital’s easily reusable APIs make transferring content from Alexa to Fire TV straightforward. Because FX Digital builds Video Skills on internal APIs and content caches, the same content can be served to Video Skills and Fire TV simultaneously.

From a development perspective, there are similarities between creating apps for Fire TV and building Video Skills. This makes it quicker and easier for FX Digital to create a Fire TV app and a Video Skill together, either simultaneously or in phases.

Why do Video Skills win in the end?

As Video Skills improve, they will become a more suitable platform for content providers when compared to YouTube. This is because with Video Skills, the provider can control what ads are served.

Furthermore, Amazon Alexa accounts can be linked to existing user accounts in providers’ own databases, which will allow for a more consistent experience. 

As more set-top boxes incorporate Video Skills, the market for them will grow, and if the user experience can be refined, consumer demand should also increase. This means more media brands will need to create Video Skills to ensure that customers can consume their content across Alexa.