Now here is a look at the changes made by Collaborans. To begin with Daniel Stone fixed an issue when waiting for fences on the i915 driver, while Emil Velikov added support to read the PCI revision for sysfs to improve the starting time in some applications.
Emilio López added a set of selftests for the Sync File Framework and Enric Balletbo i Serra added support for the ChromeOS Embedded Controller Sensor Hub. Fabien Lahoudere added support for the NVD9128 simple panel and enabled ULPI phy for USB on i.MX.
Gabriel Krisman fixed a spurious CARD_INT interrupts for SD cards that was preventing one of our kernelCI machines to boot. On the graphics side Gustavo Padovan added Explicit Synchronization support to DRM/KMS.
Martyn Welch added GPIO support for CP2105 USB serial device while Nicolas Dufresne fixed Exynos4 FIMC to roundup imagesize to row size for tiled formats, otherwise there would be enough space to fit the last row of the image. Last but not least, Tomeu Vizoso added debugfs interface to capture frames CRCs, which is quite helpful for debugging and automated graphics testing.
And now the complete list of Collabora contributions:
Daniel Stone (1):
Emil Velikov (1):
Emilio López (7):
Enric Balletbo i Serra (3):
Fabien Lahoudere (4):
Gabriel Krisman Bertazi (1):
Gustavo Padovan (18):
Martyn Welch (1):
Nicolas Dufresne (1):
Tomeu Vizoso (2):
The DRM implementation lays down on top of two kernel infrastructures, struct dma_fence, which represents the fence and struct sync file that provides the file descriptors to be shared with userspace (as it was discussed in the previous articles). With fencing the display infrastructure needs to wait for a signal on that fence before displaying the buffer on the screen. On a Explicit Fencing implementation that fence is sent from userspace to the kernel. The display infrastructure also sends back to userspace a fence, encapsulated in a struct sync_file, that will be signalled when the buffer is scanned out on the screen. The same process happens on the rendering side.
It is mandatory to use of Atomic Modesetting and here is not plan to support legacy APIs. The fence that DRM will wait on needs to be passed via the IN_FENCE_FD property for each DRM plane, that means it will receive one sync_file fd containing one or more dma_fence per plane. Remember that in DRM a plane directly relates to a framebuffer so one can also say that there is one sync_file per framebuffer.
On the other hand for the fences created by the kernel that are sent back to userspace the OUT_FENCE_PTR property is used. It is a DRM CRTC property because we only create one dma_fence per CRTC as all the buffers on it will be scanned out at the same time. The kernel sends this fence back to userspace by writing the fd number to the pointer provided in the OUT_FENCE_PTR property. Note that, unlike from what Android did, when the fence signals it means the previous buffer – the buffer removed from the screen – is free for reuse. On Android when the signal was raised it meant the current buffer was freed. However, the Android folks have patched SurfaceFlinger already to support the Mainline semantics when using Explicit Fencing!
Nonetheless, that is only one side of the equation and to have the full graphics pipeline running with Explicit Fencing we need to support it on the rendering side as well. As every rendering driver has its own userspace API we need to add Explicit Fencing support to every single driver there. The freedreno driver already has its Explicit Fencing support mainline and there is work in progress to add support to i915 and virtio_gpu.
On the userspace side Mesa already has support for the EGL_ANDROID_native_fence_sync needed to use Explicit Fencing on Android. Libdrm incorporated the headers to access the sync file IOCTL wrappers. On Android, libsync now has support for both the old Android Sync and Mainline Sinc File APIs. And finally, on drm_hwcomposer, patches to use Atomic Modesetting and Explicit Fencing are available but they are not upstream yet.
Validation tests for both Sync Files and fences on the Atomic API were written and added to IGT.]]>
For those who want to see an overall report of what was happened in the 4.9 kernel take a look on the always good LWN articles: part 1, 2 and 3.
As for Collabora contributions most of our work was in the DRM and DMABUF subsystems. Andrew Shadura and Daniel Stone added to fixes to the AMD and i915 drivers respectively. Emilio López added the missing install of sync_file.h uapi.
Gustavo Padovan advanced a few more steps on the goal to add explicit fencing to the DRM subsystem, besides a few improvements to Sync File and the virtio_gpu driver he also de-staged the SW_SYNC validation framework that helps with Sync File testing.
Peter Senna added drm_bridge support to imx-ldb device while Tomeu Vizoso improved drm_bridge support on RockChip’s analogic-dp and added documentation about validation of the DRM subsystem.
Outside of the Graphics world we had Enric Balletbo i Serra adding support to upload firmware on the ziirave watchdog device. Fabien Lahoudere and Martyn Welch enabled and improved DMA support for i.MX53 UARTs, allowing the device tree to decide whether DMA is used or not. Martyn also added a fake VMEbus (Versa Module Europa bus) to help with VME driver development.
On the Bluetooth, subsystem Frédéric Dalleau fixed an error code for SCO connections, that was causing big timeout and failures on SCO connections requests. Finally Robert Foss worked to clear the pipeline on errors for cdc-wdm USB devices.
Andrew Shadura (1):
Daniel Stone (1):
Emilio López (2):
Enric Balletbo i Serra (1):
Fabien Lahoudere (3):
Frédéric Dalleau (1):
Gustavo Padovan (14):
Martyn Welch (4):
Peter Senna Tschudin (1):
Robert Foss (2):
Tomeu Vizoso (7):
The Sync Framework was the Android solution to implement Explicit Fencing in AOSP. It uses file descriptors to communicate fencing information between userspace and kernel and between userspace process.
In the Sync Framework it all starts with the creation of a Sync Timeline, a struct created for each driver context to represent a monotonically increasing counter. It is the Sync Timeline who will guarantee the ordering between fences in the same Timeline. The driver contexts could be different GPU rings, or different Displays on your hardware.
Then we have Sync Points(sync_pt), the name Android gave to fences, they represent a specific value in the Sync Timeline. When created the Sync Point is initialized in the Active state, and when it signals, i.e., the job it was associated to finishes, it transits to the Signaled state and informs the Sync Timeline to update the value of the last signaled Sync Point.
To export and import Sync Points to/from userspace the Sync Fence struct is used. Under the hood the the Sync Fence is a Linux file and we use thte Sync Fence to store Sync Point information. To exported to userspace a unused file descriptor(fd) is associated to the Sync Fence file. Drivers can then use the file descriptor to pass the Sync Point information around.
The Sync Fence is usually created just after the Sync Point creation, it then travel through the pipeline, via userspace, until the driver that is going to wait for the Sync Fence to signal. The Sync Fence signal when the Sync Point inside it signals.
One of the most important features of the Android Sync Framework is the ability to merge Sync Fences into a new Sync Fence containing all Sync Points from both Sync Fences. It can contain as many Sync Points as your resource allows. A merged Sync Fence will only signal when all its Sync Points signals.
When it comes to userspace API the Sync Framework has implements three ioctl calls. The first one is to wait on sync_fence to signal. There is also a call to merge two sync_fences into a third and new sync_fence. And finally there is a also a call to grab information about the sync_fence and all its sync_points.
The Sync Fences fds are passed to/from the kernel in the calls to ask the kernel to render or display a buffer.
This was intended to be a overview of the Sync Framework as we will see some of these concepts on the next article where we will talk about the effort to add explict fencing on mainline kernel. If you want to learn more about the Sync Framework you can find more info here and here.]]>
On the Collabora side of the contributions we touched a few different areas in the kernel. Bob Ham, who recently left Collabora, added support for the Alea I Random Number Generator, while Enric Balletbo improved the audio support on the Rockchip rk3288 SoC. Frederic Dalleau fixed an important memory leak on the Bluetooth stack.
Gustavo Padovan continued his work add Explicit Synchronization for Buffer Sharing on the kernel. In this release he added fence_array support and prepared the SW_SYNC interfaces for de-staging, SW_SYNC meant to be used for Explict Syncronization testing. He also worked in removing some of the legacy functions from drm_irq.c from the kernel.
Helen Koike added some improvements and clean ups to the ASoC subsystem mainly on the max9877 and tpa6130a2 drivers. Nicolas Dufresne fixed the bytes per line calculation on YUV planes on the uvcvideo driver.
Thierry Escande added many improvements the NFC digital layer and Tomeu Vizoso added a new helper for the ChromeOS Embedded Controller and improved usage of DRM Core APIs on the Rockchip driver. He also fixed an issue with the Analogix DP on Rockchip that was not enabling clocks in the correct order.
Bob Ham (2):
Enric Balletbo i Serra (8):
Frederic Dalleau (1):
Gustavo Padovan (50):
Helen Koike (8):
Nicolas Dufresne (1):
Thierry Escande (26):
Tomeu Vizoso (5):
If you want to check the code we’ve been writing they are available here:
Linux Kernel: https://git.kernel.org/cgit/linux/kernel/git/padovan/linux.git/log/?h=fences
Soon we will get Explicit Fencing on Android’s drm_hwcomposer as well so expect updates on this blog with more information about that. :)
Also I would like to take the opportunity to thank Collabora for sponsoring my travel to XDC and Martin Peres for organizing such a great conference. It was my first time attending XDC and my time there was absolutely great, I have learnt a lot about what the Graphics community have been doing lately and I met the people doing this work. I was happy to see a lot of interest from many people around the Explicit Fencing work we’ve doing.
The fencing synchronization mechanism allows the sharing of buffers without the risk of a driver or userspace to read an incomplete buffer or write to a buffer that is still under use somewhere else in the system. The fencing provides ordering to these operations to make reads or writes happen only when the buffer is not used by other drivers anymore. For example,when a GPU job is queued a fence is associated to the buffer in the job, that fence can be used by other drivers for synchronization purposes, they won’t use the buffer a signal from the fence is received. The signal means the buffers is now free to be used. Similarly we can have the same setting for the GPU driver to wait the buffer to come out of the screen to render on it again.
The central piece here is the fence, an element that is attached to each buffer whenever a request involving the buffer is sent to the kernel. The fence can be used by userspace or other drivers to wait for the work to finish. So once the work is finished the fence signals and the waiter can proceed and do whatever they want with the buffer.
While Implicit Fencing helps a lot with buffer synchronization there are a few cases where the whole desktop compositing could stall. Imagine the following compositor flow: there are 3 buffers to process, A, B and C. A and B are sent for rendering in parallel while C is going to be composed of both A and B. But the compositor will only be notified when both buffers are rendered thus if B takes too long the compositing of the whole desktop will be blocked waiting for B and C won’t be displayed in time.
However with Explicit Fencing the compositor should have one fence for each buffer and will be notified when each buffer is rendered. So if A renders fast and B takes too long the compositor can decide not wait for B and proceed with the scanout of C with buffer A but an old version of B. The fencing information allows the compositor to be smart and take decisions to avoid the screen to freeze for example.
As of today the Linux Kernel only has generic APIs for Implicit Fencing, although some drivers have Explicit Fencing already their APIs are device specific. Android currently has its own implementation through the Android Sync Framework – which will be explained in the next article.
Explicit Fencing works on a Consumer-Producer fashion. In an GPU rendering + scanout to the screen pipeline it would synchronize between the kernel drivers, so when submitting a new rendering job to the GPU(Producer side) userspace would get back a fence related to that buffer submitted. That means userspace doesn’t need to block waiting for the job to complete, a signal is sent when the job is finished. As userspace doesn’t need to block it and has a fence of the buffer it then can proceed right away with the syscall to ask the display hardware(Consumer) to scanout the buffer that is yet to be processed. With explicit fencing the kernel is taught to wait for the fence to signal, before starting the scanout process.
A new fence is returned to userspace when the buffer is submitted to the kernel for scanout on the display hardware, that fence will signal when the buffer is not being displayed anymore, thus is ready for reuse by another rendering job. When the userspace gets this fence back it can submit a new rendering job to the GPU without waiting. The wait is done on the kernel side by the GPU driver, once the fence signals the rendering on that buffer can be initiated.
Last but not least, debugability of the graphics pipeline is improved. Having access to the fence in userspace helps a lot understanding what is happening in the pipeline. Previously, with Implicit Fencing there was no infomation available, so it was hard to figure out what was happening on the pipeline, also each vendor was trying to implement their own Implicit Fencing mechanism. Now with an standard Explicit Fencing mechanism it easier to build debug/tracing infrastructure that can be used to investigate issues in any system.
The next article will explain the Android Sync Framework and later the work on mainline to support explicit fencing will be described.]]>
My presentation covered the effort to create the Explicit Fencing mechanism on the Linux Kernel which is to be used mainly by the Graphics pipeline. In short, Explicit Fencing is a way to give userspace information about the current state of shared buffers inside the kernel. This is done through fences, that can then be passed around to userspace and/or other kernel drivers for synchronization purposes. This allows both userspace and kernel to wait for kernel jobs to finish without blocking. It also significantly helps the compositor take more efficient and smart decisions on scheduling frames to display on the screen. I’ll be posting an article with more details on it soon. :)
Finally I would like to thank Collabora for sponsoring my travel to LinuxCon.]]>
Enric added support for the Analogix anx78xx DRM Bridge and fixed two SD Card related issues on OMAP igep00x0: fix remove/insert detection and enable support to read the write-protect pin.
Gustavo de-staged the sync_file framework (Android Sync framework) that will be used to add explicit fencing support to the graphics pipeline and started a work to clean up usage of legacy vblank helpers.
Helen Koike created a separated module for the V4L2 Test Pattern Generator and fixed return errors on the pipeline validation code while Robert Foss improved the DRM documentation and fixed drm/vc4 (Raspberry Pi) when there is already a pending update when calling atomic_commit.
Tomeu fixed two Rockchip issues while working on the intel-gpu-tools support for other platforms.
Enric Balletbo i Serra (6):
Gustavo Padovan (22):
Helen Koike (3):
Robert Foss (3):
Tomeu Vizoso (2):
As part of Collabora’s continued commitment to further increase its participation to the Linux Kernel, Collabora is actively looking to expand its team of core software engineers. If you’d like to learn more, follow this link.
Here are some highlights of Collabora’s participation in Kernel 4.6:
Andrew Shadura fixed the number of buttons reported on the Pemount 6000 USB touchscreen controller, while Daniel Stone enabled BCM283x familiy devices in the ARM multi_v7_defconfig and Emilio López added module autoloading for a few sunxi devices.
Enric Balletbo i Serra added boot console output to AM335X(Sitara) and OMAP3-IGEP and fixed audio codec setup on AM335X using the right external clock. Martyn Welch added the USB device ID for the GE Healthcare cp210x serial device and renamed the reset reason of the Zodiac Watchdog.
Gustavo Padovan cleaned up the Android Sync Framework on the staging tree for further de-staging of the Sync File infrastructure, which will land in 4.7. Most of the work was removing interfaces that won’t be used in mainline. He also added vblank event support for atomic commits in the virtio DRM driver.
Peter Senna improved an error path and added some style fixes to the sisusbvga driver. While Sjoerd Simons enabled wireless on radxa Rock2 boards, fixed an issue withthe brcmfmac sdio driver sometimes timing out with a false positive and fixed some issues with Serial output on Renesas R-Car porter board.
Tomeu Vizoso changed driver_match_device() to return errors and in case of -EPROBE_DEFER queue the device for deferred probing, he also provided two fixes to Rockchip DRM driver as part of his work on making intel-gpu-tools work on other platforms.
Following is a list of all patches submitted by Collabora for this kernel release:
Andrew Shadura (1):
Daniel Stone (1):
Emilio López (4):
Enric Balletbo i Serra (3):
Gustavo Padovan (17):
Martyn Welch (2):
Peter Senna Tschudin (4):
Sjoerd Simons (6):
Tomeu Vizoso (4):