Planet VideoLAN

Welcome to Planet VideoLAN. This page gathers the blogs and feeds of VideoLAN's developers and contributors. As such, it doesn't necessarily represent the opinion of all the developers, the VideoLAN project, ...

VLC for Android

May 21, 2020

dav1d 0.7.0: mobile focus

Jean-Baptiste Kempf

tl;dr

New dav1d release:

  • 10% faster on Intel CPUs with 25% less RAM, assembly finished for 8bit
  • ARM64 assembly mostly done for 10/12bit in addition to 8bit
  • dav1d is twice as fast as gav1 on ARM CPUs, and 4 times faster for 10b
  • 1080p AV1 decodable in real time with 2 LITTLE cores on Pixel 1

A few reminders about dav1d

If you follow this blog, you should know everything about dav1d.

The VideoLAN, VLC and FFmpeg communities have been working on a new AV1 decoder, dav1d, in order to create the best and fastest decoder.

A new very fast release

0.7.0 is a major new release, whose focus is, once again, speed. It is doubly interesting, because the improvements matter for both computers and smartphones.

The ref_mv rewrite

For once, the biggest speed improvement for desktops and laptops is not coming from writing more assembly code, but from Ronald's rewrite of the ref_mv algorithm.

This new algo gives an 8-12% speed improvement measured on Haswell machines while reducing memory usage by 25%.
We're talking about 10% faster for the complete AV1 decoding, which is a bigger impact than a lot of the assembly we wrote.

x86 Assembly

With the 0.7.0 release, the assembly for x86 CPUs (32bit and 64bit) is now totally complete for the 8bit bitdepth.

We finished up all the small optimizations that remained for SSSE3 and AVX2, notably film grain, during the 0.6.0 and 0.7.0 development cycles. We added more AVX-512 assembly, for those with very recent CPUs.

In the future, getting faster on those Intel CPUs is going to be very difficult (I know I said that already many times, but this time it's true).

dav1d is still around 3x to 5x faster than aomdec on normal computers, and we are now even faster :). See older posts for more information.

ARM Assembly

As with 0.6.0, an important focus of dav1d 0.7.0 was ARM assembly, notably for the 10bit bitdepth cases.

As of 0.7.0, most of the assembly you should care about is done for 8bit/10bit/12bit on ARM64, and this makes decoding AV1 on phones affordable.

ARM speed vs gav1

gav1 is an open source decoder made by Google to compete with dav1d on Android and ARM.

As of 0.7.0, dav1d is between 1.8x and 2.5x faster on 8b content and 2.4x to 5x faster on 10b content than gav1 on different CPUs.

[Graph: dav1d vs gav1] This graph was made on an ODroid N2, for example.

Deep dive on ARM cores and performance

ARM CPUs for mobile devices have an architecture with both LITTLE and big cores, which offer different speeds and different power usage.

Using different types of cores lets a device consume only the power it needs for normal tasks, and go to max power when requested.

It is therefore extremely important to analyze the performance of our ARM code on both types of cores and when mixing them.
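
To measure one type of core at a time, a benchmark process has to be kept on those cores. The post does not say how this was done for these measurements; the following is only a minimal sketch of one way to do it on Linux, using Python's os.sched_setaffinity. The helper name pin_to_cores is made up for illustration, and which CPU ids are LITTLE or big depends on the device.

```python
import os


def pin_to_cores(cores):
    """Restrict the current process to the given CPU ids (Linux only).

    Returns the set of CPUs the process may run on afterwards.
    Hypothetical helper, not part of dav1d or of the benchmark setup.
    """
    allowed = os.sched_getaffinity(0)   # CPUs we may use right now
    wanted = set(cores) & allowed       # ignore ids that are unavailable
    if not wanted:
        raise ValueError("none of the requested cores is available")
    os.sched_setaffinity(0, wanted)     # 0 means "the calling process"
    return os.sched_getaffinity(0)


# Usage (the ids are an assumption; check the topology of your device):
#   pin_to_cores({0, 1})   # e.g. run the decoder benchmark on two cores only
```

Threads started after the call inherit the same CPU mask, so a decoder launched from such a process stays on the chosen cores.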

So let's have a look at how dav1d and gav1 compare on Chimera, the reference AV1 sample made by Netflix, on the SnapDragon 821 (Pixel 1 phone): [Graphs: dav1d cores]

Learnings

What we can learn from those graphs is the following:

  • dav1d can decode this sample in real time (24fps), in all the above configurations, starting from 2 threads
  • gav1 is never able to decode that sample at 24fps, in LITTLE, big and big.LITTLE configurations
  • threading in gav1 is catastrophic: the more threads you add, the less efficient the decoding is
  • threading in dav1d is quite good: it always increases the performance, when you add more threads
  • max performance is around 2.3x faster in dav1d than gav1

For 10b, the situation is even worse for gav1.

I want to emphasize that dav1d can decode Chimera with 2 threads on the Pixel 1, from 2016, using only the LITTLE cores.
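
The two notions used in the list above, "decodes at 24fps" and "threading efficiency", can be made concrete with a little arithmetic. The following is a rough sketch, not the method used for the graphs: the function names are invented, and the fps values in the example are made up, not taken from the measurements.

```python
TARGET_FPS = 24.0  # the real-time threshold used in this post


def is_realtime(fps, target=TARGET_FPS):
    """True when the decoder keeps up with the stream's frame rate."""
    return fps >= target


def parallel_efficiency(fps_by_threads):
    """Speedup over the 1-thread run, divided by the thread count.

    1.0 means perfect scaling; values falling towards 0 mean that each
    extra thread contributes less and less.
    """
    base = fps_by_threads[1]
    return {n: (fps / base) / n for n, fps in sorted(fps_by_threads.items())}


def min_threads_for_realtime(fps_by_threads, target=TARGET_FPS):
    """Smallest thread count that reaches the target, or None."""
    for n, fps in sorted(fps_by_threads.items()):
        if is_realtime(fps, target):
            return n
    return None


# Made-up fps values, for illustration only.
example = {1: 15.0, 2: 27.0, 4: 45.0}
print(parallel_efficiency(example))
print(min_threads_for_realtime(example))
```

A decoder that threads well keeps the efficiency close to 1.0 as threads are added; one that threads badly sees it drop quickly.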

Focus on LITTLE cores on Android

So, what's interesting is to look at the LITTLE cores performance on Android to see the actual speed of the decoder in low-power cases.

Here we tested all the thread configurations on the following Android devices:

  • Google Pixel 1 (SnapDragon 821) (2016)
  • Google Pixel 2 (SnapDragon 835) (2017)
  • Google Pixel 3 (SnapDragon 845) (2018)
  • Xiaomi Mi 9T Pro (SnapDragon 855) (2019)

Here are the results: [Graph: dav1d cores]

Once again, we can see, on LITTLE cores:

  • dav1d is always at least 2x faster than gav1
  • we still see the previously mentioned threading issues on gav1
  • dav1d can decode Chimera at 24fps starting with 2 threads on the LITTLE cores, gav1 cannot

AV1 10bit on LITTLE

For the sake of completeness, here are the results for 10b on the LITTLE cores: [Graph: dav1d cores]

You can find all the details here, in the spreadsheet done by Nathan.

Conclusion and Thanks

dav1d is now a very fast decoder on desktops and laptops, but above all on mobile, where it shows very impressive performance on 8b and 10b. It can decode 1080p with a couple of cores on mobile.

Thanks a lot to Nathan Egge, from Mozilla, who gathered all the data required for this post. He therefore did all the work for this blogpost.

May 21, 2020 03:36 PM

December 03, 2019

Announcing VLC 3.2

Geoffrey Métais

VLC 3.2 is the second feature update for VLC 3. This update focused on polishing the user interface, adding some more features to Android TV and Chromebooks and improving network browsing.

Under the hood, the VLC app is now (almost) 100% written in Kotlin!

UI redesign, including player and TV browsers

Android TV

Browsing on Android TV has been made easier and more efficient.

You can now browse your media, filter them and jump to headers in a brand new screen.

When listening to music, a Now playing line displays the media currently being played.

Players

On phones and tablets, the video player interface has been modernized and the controls have been reorganized.

The audio player design has also been improved.

Playlists

The playlist now has redesigned covers and can be easily modified.

Equalizer

The equalizer UI has been completely reworked and is now more pleasant to see and to use!

Media info

The media information screen has been redesigned to be clearer.

Chromebook

We added support for desktop multi-selection with Ctrl and Shift. We also implemented most of the VLC desktop player keyboard shortcuts.
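
For readers who have never had to implement it, desktop multi-selection usually follows a small set of rules: a plain click selects one item, Ctrl toggles an item, Shift selects the range from the last clicked item. Here is a minimal sketch of those conventional rules, assuming that behaviour; it is not the app's actual code, and the SelectionModel class is made up for illustration.

```python
class SelectionModel:
    """Desktop-style selection over a list of `size` items (illustrative)."""

    def __init__(self, size):
        self.size = size
        self.selected = set()
        self.anchor = None  # last item clicked without Shift

    def click(self, index, ctrl=False, shift=False):
        """Apply one click and return the selected indexes, sorted."""
        if not 0 <= index < self.size:
            raise IndexError(index)
        if shift and self.anchor is not None:
            lo, hi = sorted((self.anchor, index))
            span = set(range(lo, hi + 1))
            # Ctrl+Shift extends the selection; plain Shift replaces it.
            self.selected = self.selected | span if ctrl else span
        elif ctrl:
            self.selected ^= {index}   # toggle a single item
            self.anchor = index
        else:
            self.selected = {index}    # plain click selects only this item
            self.anchor = index
        return sorted(self.selected)
```

For example, clicking item 2 and then Shift-clicking item 5 selects items 2 to 5, and a following Ctrl-click on item 8 adds it to the selection.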

The support for external devices has also been improved.

You can read more about this in the Android developer story.

Group videos by name feature

Videos can be grouped by name or by folder, allowing you to find the right video quicker. This implementation is quite basic right now; the next feature update will improve it with a grid view for any kind of video grouping.
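
To give an idea of what such a grouping involves, here is a rough sketch of both modes. It is only an illustration under simple assumptions, not the Medialibrary implementation: the function names are invented, and grouping by name is approximated by comparing the first prefix_len characters of the file name, a hypothetical parameter.

```python
import os
from collections import defaultdict


def group_by_folder(paths):
    """Map each parent folder to the video paths it contains."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.path.dirname(path)].append(path)
    return dict(groups)


def group_by_name(paths, prefix_len=6):
    """Group videos whose file names share their first `prefix_len` characters.

    `prefix_len` is a hypothetical tuning knob; the comparison is
    case-insensitive and ignores the file extension.
    """
    groups = defaultdict(list)
    for path in paths:
        name = os.path.splitext(os.path.basename(path))[0]
        groups[name[:prefix_len].lower()].append(path)
    return dict(groups)
```

With this approach, the episodes of a series named with a common prefix end up in the same group, while unrelated videos each form their own group.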

Voice search on TV

The Medialibrary content is now searchable from the TV launcher.

Cleaner media titles

The media titles are cleaned up for better readability. Medialibrary now removes file extensions and common garbage keywords from video file titles.
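
As an illustration of this kind of cleanup, here is a small sketch. It is not the Medialibrary code: the clean_title function is invented for this example, and the keyword list is a hypothetical sample, since the real list is not given here.

```python
import os
import re

# Hypothetical keyword list, for illustration only.
GARBAGE_KEYWORDS = ("1080p", "720p", "x264", "x265", "bluray", "webrip", "hdtv")

_GARBAGE_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, GARBAGE_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def clean_title(filename):
    """Return a display title: no extension, no garbage keywords."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    stem = re.sub(r"[._]+", " ", stem)         # dots/underscores act as spaces
    stem = _GARBAGE_RE.sub(" ", stem)          # drop known garbage tokens
    return re.sub(r"\s+", " ", stem).strip()   # collapse leftover whitespace
```

With the sample list above, a file named Big.Buck.Bunny.1080p.x264.mkv would be displayed as Big Buck Bunny.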

Improved SMBv2/3 support

The SMB support has been improved to provide a better network browsing experience. There should be no conflict between SMBv1 and v2/v3 anymore. The login process has been improved too.

Under the hood

LibVLC has been updated to get the latest fixes, security patches and performance enhancements.

The whole Android app is now 100% written in Kotlin! (but the medialibrary and libvlc API bindings remain in Java, to avoid introducing a Kotlin dependency)

We also improved the architecture by continuing our Android Arch components integration and by using Kotlin coroutines.

December 03, 2019 12:00 AM