Mainline Explicit Fencing – part 2

In the first post we covered the main concepts behind Explicit Synchronization for the Linux Kernel. Now in the second post of the series we are going to look to the Android Sync Framework, the first (out-of-tree) Explicit Fencing implementation for the Linux Kernel.

The Sync Framework was the Android solution to implement Explicit Fencing in AOSP. It uses file descriptors to communicate fencing information between userspace and kernel and between userspace process.

In the Sync Framework it all starts with the creation of a Sync Timeline, a struct created for each driver context to represent a monotonically increasing counter. It is the Sync Timeline who will guarantee the ordering between fences in the same Timeline. The driver contexts could be different GPU rings, or different Displays on your hardware.

Sync Timeline

Sync Timeline

Then we have Sync Points(sync_pt), the name Android gave to fences, they represent a specific value in the Sync Timeline. When created the Sync Point is initialized in the Active state, and when it signals, i.e., the job it was associated to finishes, it transits to the Signaled state and informs the Sync Timeline to update the value of the last signaled Sync Point.

Sync Point

Sync Point

To export and import Sync Points to/from userspace the Sync Fence struct is used. Under the hood the the Sync Fence is a Linux file and we use thte Sync Fence to store Sync Point information. To exported to userspace a unused file descriptor(fd) is associated to the Sync Fence file. Drivers can then use the file descriptor to pass the Sync Point information around.

Sync Fence

Sync Fence

The Sync Fence is usually created just after the Sync Point creation, it then travel through the pipeline, via userspace, until the driver that is going to wait for the Sync Fence to signal. The Sync Fence signal when the Sync Point inside it signals.

One of the most important features of the Android Sync Framework is the ability to merge Sync Fences into a new Sync Fence containing all Sync Points from both Sync Fences. It can contain as many Sync Points as your resource allows. A merged Sync Fence will only signal when all its Sync Points signals.

Sync Fence with Merged fences

Sync Fence with Merged Fences. Here we merge two Sync Points into one Sync File.

When it comes to userspace API the Sync Framework has implements three ioctl calls. The first one is to wait on sync_fence to signal. There is also a call to merge two sync_fences into a third and new sync_fence. And finally there is a also a call to grab information about the sync_fence and all its sync_points.

The Sync Fences fds are passed to/from the kernel in the calls to ask the kernel to render or display a buffer.

This was intended to be a overview of the Sync Framework as we will see some of these concepts on the next article where we will talk about the effort to add explict fencing on mainline kernel. If you want to learn more about the Sync Framework you can find more info here and here.

Collabora Contributions to Linux Kernel 4.8

Linux Kernel 4.8 is out and once more Collabora engineers did a significant contribution to the Kernel. For the 4.8 Collabora contributed 101 patches by 8 engineers, our record to date in single kernel release! We’ve also seen the first contribution from Frederic Dalleau since he joined Collabora. LWN.net covered the new features of the new kernel in three different posts, here, here and here.

On the Collabora side of the contributions we touched a few different areas in the kernel. Bob Ham, who recently left Collabora, added support for the Alea I Random Number Generator, while Enric Balletbo improved the audio support on the Rockchip rk3288 SoC. Frederic Dalleau fixed an important memory leak on the Bluetooth stack.

Gustavo Padovan continued his work add Explicit Synchronization for Buffer Sharing on the kernel. In this release he added fence_array support and prepared the SW_SYNC interfaces for de-staging, SW_SYNC meant to be used for Explict Syncronization testing. He also worked in removing some of the legacy functions from drm_irq.c from the kernel.

Helen Koike added some improvements and clean ups to the ASoC subsystem mainly on the max9877 and tpa6130a2 drivers. Nicolas Dufresne fixed the bytes per line calculation on YUV planes on the uvcvideo driver.

Thierry Escande added many improvements the NFC digital layer and Tomeu Vizoso added a new helper for the ChromeOS Embedded Controller and improved usage of DRM Core APIs on the Rockchip driver. He also fixed an issue with the Analogix DP on Rockchip that was not enabling clocks in the correct order.

Bob Ham (2):

Enric Balletbo i Serra (8):

Frederic Dalleau (1):

Gustavo Padovan (50):

Helen Koike (8):

Nicolas Dufresne (1):

Thierry Escande (26):

Tomeu Vizoso (5):

My talk about Mainline Explicit Fencing at XDC 2016!

Last week I was at XDC in Helsinki where I presented about the Explicit Fencing work we’ve been doing on the Mainline Linux Kernel in the lastest few months. There was a livestream of all presentations during the conference and recorded sections are available. You can check the video of my presentation. Check out the slides too.

If you want to check the code we’ve been writing they are available here:

Linux Kernel: https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=fences

Mesa: https://git.collabora.com/cgit/user/padovan/mesa.git/log/?h=fences

libdrm: https://git.collabora.com/cgit/user/padovan/libdrm.git/log/?h=fences

kmscube: https://github.com/robclark/kmscube/tree/atomic-fence

Soon we will get Explicit Fencing on Android’s drm_hwcomposer as well so expect updates on this blog with more information about that. :)

Also I would like to take the opportunity to thank Collabora for sponsoring my travel to XDC and Martin Peres for organizing such a great conference. It was my first time attending XDC and my time there was absolutely great, I  have learnt a lot about what the Graphics community have been doing lately and I met the people doing this work. I was happy to see a lot of interest from many people around the Explicit Fencing work we’ve doing.

 

Mainline Explicit Fencing – part 1

When it comes to buffer sharing synchronization in the kernel there are two ways of doing it: Implicit Fencing and Explicit Fencing. The difference between them relies on the fact that the kernel may or may not share synchronization information with userspace, it will either be implicit, with no fencing information provided, or explicit with all information available to userspace.

The fencing synchronization mechanism allows the sharing of buffers without the risk of a driver or userspace to read an incomplete buffer or write to a buffer that is still under use somewhere else in the system. The fencing provides ordering to these operations to make reads or writes happen only when the buffer is not used by other drivers anymore. For example,when a GPU job is queued a fence is associated to the buffer in the job, that fence can be used by other drivers for synchronization purposes, they won’t use the buffer a signal from the fence is received. The signal means the buffers is now free to be used. Similarly we can have the same setting for the GPU driver to wait the buffer to come out of the screen to render on it again.

The central piece here is the fence, an element that is attached to each buffer whenever a request involving the buffer is sent to the kernel. The fence can be used by userspace or other drivers to wait for the work to finish. So once the work is finished the fence signals and the waiter can proceed and do whatever they want with the buffer.

While Implicit Fencing  helps a lot with buffer synchronization there are a few cases where the whole desktop compositing could stall. Imagine the following compositor flow: there are 3 buffers to process, A, B and C. A and B are sent for rendering in parallel while C is going to be composed of both A and B. But the compositor will only be notified when both buffers are rendered thus if B takes too long the compositing of the whole desktop will be blocked waiting for B and C won’t be displayed in time.

A compositor processing two buffers in parallel

A compositor processing two buffers in parallel, with Implicit Fencing if B takes too long the desktop compositor freezes.

However with Explicit Fencing the compositor should have one fence for each buffer and will be notified when each buffer is rendered. So if A renders fast and B takes too long the compositor can decide not wait for B and proceed with the scanout of C with buffer A but an old version of B. The fencing information allows the compositor to be smart and take decisions to avoid the screen to freeze for example.

As of today the Linux Kernel only has generic APIs for Implicit Fencing, although some drivers have Explicit Fencing already their APIs are device specific. Android currently has its own implementation through the Android Sync Framework – which will be explained in the next article.

Explicit Fencing works on a Consumer-Producer fashion. In an GPU rendering + scanout to the screen pipeline it would synchronize between the kernel drivers, so when submitting a new rendering job to the GPU(Producer side) userspace would get back a fence related to that buffer submitted. That means userspace doesn’t need to block waiting for the job to complete, a signal is sent when the job is finished. As userspace doesn’t need to block it and has a fence of the buffer it then can proceed right away with the syscall to ask the display hardware(Consumer) to scanout the buffer that is yet to be processed. With explicit fencing the kernel is taught to wait for the fence to signal, before starting the scanout process.

A new fence is returned to userspace when the buffer is submitted to the kernel for scanout on the display hardware, that fence will signal when the buffer is not being displayed anymore, thus is ready for reuse by another rendering job. When the userspace gets this fence back it can submit a new rendering job to the GPU without waiting. The wait is done on the kernel side by the GPU driver, once the fence signals the rendering on that buffer can be initiated.

Explicit Fencing

The fence travels all the way to userspace and the next element on the pipeline. The yellow arrows represents the fences on userspace.

Last but not least, debugability of the graphics pipeline is improved. Having access to the fence in userspace helps a lot understanding what is happening in the pipeline. Previously, with Implicit Fencing there was no infomation available, so it was hard to figure out what was happening on the pipeline, also each vendor was trying to implement their own Implicit Fencing mechanism. Now with an standard Explicit Fencing mechanism it easier to build debug/tracing infrastructure that can be used to investigate issues in any system.

The next article will explain the Android Sync Framework and later the work on mainline to support explicit fencing will be described.

Slides for my LinuxCon talk on Mainline Explicit Fencing

For those of you that are interested here are the slides of the my presentation at LinuxCon North America this week. The conference was great with very good talks and very interesting meetings on the hallway track.

My presentation covered the effort to create the Explicit Fencing mechanism on the Linux Kernel which is to be used mainly by the Graphics pipeline. In short, Explicit Fencing is a way to give userspace information about the current state of shared buffers inside the kernel. This is done through fences, that can then be passed around to userspace and/or other kernel drivers for synchronization purposes. This allows both userspace and kernel to wait for kernel jobs to finish without blocking. It also significantly helps the compositor take more efficient and smart decisions on scheduling frames to display on the screen. I’ll be posting an article with more details on it soon. :)

Finally I would like to thank Collabora for sponsoring my travel to LinuxCon.

Collabora contributions to Linux Kernel 4.7

Linux Kernel 4.7 was released this week with a total of 36 contributions from five Collabora engineers. It includes the first contributions from Helen as Collaboran and the first ever contributions on the kernel from Robert Foss. Here are some of the highlights of the work Collabora have done on Linux Kernel 4.7.

Enric added support for the Analogix anx78xx DRM Bridge and fixed two SD Card related issues on OMAP igep00x0: fix remove/insert detection and enable support to read the write-protect pin.

Gustavo de-staged the sync_file framework (Android Sync framework) that will be used to add explicit fencing support to the graphics pipeline and started a work to clean up usage of legacy vblank helpers.

Helen Koike created a separated module for the V4L2 Test Pattern Generator and fixed return errors on the pipeline validation code while Robert Foss improved the DRM documentation and fixed drm/vc4 (Raspberry Pi) when there is already a pending update when calling atomic_commit.

Tomeu fixed two Rockchip issues while working on the intel-gpu-tools support for other platforms.

Enric Balletbo i Serra (6):

Gustavo Padovan (22):

Helen Koike (3):

Robert Foss (3):

Tomeu Vizoso (2):

Collabora contributions to Linux Kernel 4.6

Linux Kernel 4.6 was released this week, and a total of 9 Collabora engineers took part in its development, Collabora’s highest number of engineers contributing to a single Linux Kernel release yet. In total Collabora contributed 42 patches.

As part of Collabora’s continued commitment to further increase its participation to the Linux Kernel, Collabora is actively looking to expand its team of core software engineers. If you’d like to learn more, follow this link.

Here are some highlights of Collabora’s participation in Kernel 4.6:

Andrew Shadura fixed the number of buttons reported on the Pemount 6000 USB touchscreen controller, while Daniel Stone enabled BCM283x familiy devices in the ARM multi_v7_defconfig and Emilio López added module autoloading for a few sunxi devices.

Enric Balletbo i Serra added boot console output to AM335X(Sitara) and OMAP3-IGEP and fixed audio codec setup on AM335X using the right external clock. Martyn Welch added the USB device ID for the GE Healthcare cp210x serial device and renamed the reset reason of the Zodiac Watchdog.

Gustavo Padovan cleaned up the Android Sync Framework on the staging tree for further de-staging of the Sync File infrastructure, which will land in 4.7. Most of the work was removing interfaces that won’t be used in mainline. He also added vblank event support for atomic commits in the virtio DRM driver.

Peter Senna improved an error path and added some style fixes to the sisusbvga driver. While Sjoerd Simons enabled wireless on radxa Rock2 boards, fixed an issue withthe brcmfmac sdio driver sometimes timing out with a false positive and fixed some issues with Serial output on Renesas R-Car porter board.

Tomeu Vizoso changed driver_match_device() to return errors and in case of -EPROBE_DEFER queue the device for deferred probing, he also provided two fixes to Rockchip DRM driver as part of his work on making intel-gpu-tools work on other platforms.

Following is a list of all patches submitted by Collabora for this kernel release:

Andrew Shadura (1):

Daniel Stone (1):

Emilio López (4):

Enric Balletbo i Serra (3):

Gustavo Padovan (17):

Martyn Welch (2):

Peter Senna Tschudin (4):

Sjoerd Simons (6):

Tomeu Vizoso (4):

Collabora contributions to Linux Kernel 4.5

Linux Kernel 4.5 was released earlier this week, and once again Collabora engineers played a role in its development. In addition to their current projects, seven Collabora engineers contributed a total of 33 patches to the new Kernel.

As part of its continued committment to further increase ts participation to the Linux Kernel, Collabora is looking to expand its team of core software engineers. If you’d like to learn more, follow this link.

Here are some highlights of Collabora’s participation in Kernel 4.5:

Daniel Stone improved i915 runtime WARN() messages and fixed an important issue in the component subsystem when component_add() fails. Danilo Cesar made the DRM Docbook ready for Markdown text.

Gustavo Padovan improved the pm_runtime management on the drm/exynos driver and started work on de-staging the Android Sync Framework. On Rockchip, Sjoerd Simons enabled IR receiver to RK3288 Radxa Rock 2 Square, added multi_v7_defconfig for Rockchip audio and enabled RK3288 SPDIF clocks to change their parent. On the net side, Sjoerd added a patch to turn carrier off on phy attach to avoid unknown states and another patch to add ethernet0 alias for the RK3288 to help u-boot find this device-node.

During his brief time with us at Collabora, Heiko Stübner added the dts file for the veyron-brain board, a shutdown callback to platform variant dwc2 devices for a special clock handling to avoid getting stuck on the reboot/poweroff process and multi_v7_defconfig support to Rockchip’s io-domain driver, crypto module and rk808 clkout module. He also enabled support for veyron minnie touchscreen, adjusted temperature limits on veyron-speedy and fixed the edp-24m clock to be associated to the internal 24MHz oscillator all the time.

Martyn Welch added a driver for the Zodiac Aerospace RAVE Watchdog Processor, while Tomeu Vizoso added a device_is_bound() helper function and setter for dev.pm_domain that comes with extra checkings. Tomeu also added a patch to allow USB devices to remain runtime-suspended when sleeping and another patch to optimize sleep by going direct_complete if driver has no prepare and PM callbacks. Lastly, Tomeu also fixed a freq issue on Tegra devfreq_dev_profile.target callback.

Following is a list of all patches submitted by Collabora for this kernel release:

Daniel Stone (3):

Danilo Cesar Lemes de Paula (1):

Gustavo Padovan (9):

Heiko Stübner (8):

Martyn Welch (2):

Sjoerd Simons (5):

Tomeu Vizoso (5):

Collabora contributions to Linux Kernel 4.4

Linux Kernel 4.4 was released this week and Collabora engineers helped in the development of the new kernel in a few different areas. A total of 38 patches from 8 Collabora engineers were added, making it the kernel release with the most Collabora developers ever! Only 7 of the 8 engineers are still part Collabora however, as unfortunately Javier left a few months ago, after completing his patches.

On that note, Collabora is hiring experienced kernel hackers to further increase our participation in the Linux Kernel. If you are interested, please drop a line!

In this release Daniel Stone fixed a potential circular deadlock when loading the i915 GuC firmware and incorrect pipe paramenter on drm_crtc_send_vblank_event() that was leading to WARN_ON. Danilo Cesar Lemes de Paula improved the kernel-doc script to fix an issue with struct drm_modeset_lock not showing at the final kernel Doc and fixes a fault in the highlight processing by using arrays instead of hashes.

Emilio López enabled EC verified boot context on Peach Boards and driver to read/write nvram’s verified boot context to/from userspace for Chromebook devices and Enric Balletbo i Serra added support for TI’s tps65217 charger driver while Gustavo Padovan added cursor support on exynos DRM driver. Javier did some improvements to the Chromebook EC driver.

Sjoerd Simons added rockchip support by default on ARM multi_v7_defconfig and a driver for the SPDIF audio transceiver on rockchip boards. Tomeu Vizoso removed the regulator_list as it was redundant because the regulators devices can be found through the regulator_class, fixed an clk reparenting issue on exynos5250 that was preventing the screen to work after the second suspend.

A full list of all commits is provided here:

Daniel Stone (2):

Danilo Cesar Lemes de Paula (3):

Emilio López (3):

Enric Balletbo i Serra (3):

Gustavo Padovan (3):

Javier Martinez Canillas (3):

Sjoerd Simons (17):

Tomeu Vizoso (4):

Se comunicando com segurança na Internet

O Bloqueio do WhatsApp é bom momento pra gente pensar em segurança da informação.

Se tem um coisa que eu ficaria muito feliz se todas a pessoas entendessem é que a *internet não é um lugar seguro*. Uma analogia que a ajuda a entender: se comunicar online é como morar num quarto que não tem porta, as pessoas podem ir lá e ver qualquer coisa que você está fazendo em privado.

A alternativa ao WhatApp segura é o Signal. Feito por uma galera de fato compremetida com comunição segura. Tem app pra Android e iPhone. https://whispersystems.org/ (IMPORTANTE: O telegram não é seguro, existem algumas preocupações sobre vulnerabilidades no protocolo de criptografia)

Pra navegar na internet a opção é o Mozilla Firefox, um navegador que não deixa nada a desejar em relação ao Chrome por exemplo. https://www.mozilla.org/pt-BR/firefox/new/

Porém algumas extensões são necessárias pra de fato navegar com segurança. A Electronic Frontier Foundation (EFF) disponibiliza o HTTPS Everywhere (https://www.eff.org/Https-everywhere) e o Privacy Badger(https://www.eff.org/privacybadger). Juntos, eles garantem que sua comunicação com sites que você usa normalmente é segura e livre de espionagem seja de corporações, governos ou criminosos.

Além disso o AdBlockPlus(https://adblockplus.org/) é um adicional muito bom, bloqueando propagandas e trackers indesejados.

Todas essas ferramentas são software livre, isso significa que o código fonte desses software está disponível para consulta e contribuições de terceiros.

Do mesmo jeito que pensamos em segurança no dia-a-dia devemos pensar nela na internet. Você não deixa seu portão ou carro aberto pra que quiser entrar. Você não troca segredos com seu namorado(a) na frente de outras pressoas. Na internet é mesma coisa.

Quem quiser saber mais, ou trocar idéia disso, mande um alô.