K230 GPU API reference#
Copyright 2023 Canaan Inc. ©
Disclaimer#
The products, services or features you purchase should be subject to Canaan Inc. (“Company”, hereinafter referred to as “Company”) and its affiliates are bound by the commercial contracts and terms and conditions of all or part of the products, services or features described in this document may not be covered by your purchase or use. Unless otherwise agreed in the contract, the Company does not provide any express or implied representations or warranties as to the correctness, reliability, completeness, merchantability, fitness for a particular purpose and non-infringement of any statements, information, or content in this document. Unless otherwise agreed, this document is intended as a guide for use only.
Due to product version upgrades or other reasons, the content of this document may be updated or modified from time to time without any notice.
Trademark Notice#
, “Canaan” and other Canaan trademarks are trademarks of Canaan Inc. and its affiliates. All other trademarks or registered trademarks that may be mentioned in this document are owned by their respective owners.
Copyright 2023 Canaan Inc.. © All Rights Reserved. Without the written permission of the company, no unit or individual may extract or copy part or all of the content of this document without authorization, and shall not disseminate it in any form.
Directory#
[TOC]
preface#
Overview#
This document describes the usage guide for the K230 chip 2.5D GPU module.
Reader#
This document (this guide) is intended primarily for:
Technical Support Engineer
Software Development Engineer
Definition of acronyms#
Abbreviation |
Full name |
---|---|
GPU |
Graphics Process Unit |
BLIT |
Bit Block Transfer |
CLUT |
Color Look Up Table |
Revision history#
Version version number |
Modify the description |
Author |
date |
---|---|---|---|
V1.0 |
Initial |
Huang Ziyi |
2023/04/06 |
V1.1 |
Adjust the formatting |
Huang Ziyi |
2023/05/06 |
1. Function introduction#
This module is mainly used to accelerate the drawing of vector graphics, which can be used to draw menus and other pages, etc., and supports accelerating some LVGL drawing. The GPU has a series of drawing instructions, write the drawing instructions to memory, submit the address and the total length of the instructions to the GPU, and start drawing. This module supports polygon, quadratic Bezier, cubic Bezier and ellipse fill drawing, linear gradient fill, color lookup table, image compositing and blending, and BLIT.
2. Data flow#
The GPU software driver section includes the device /dev/vg_lite and its kernel module driver vg_lite.ko, as well as the user-mode library libvg_lite.so, libvg_lite.so opens /dev/vg_lite the device and interacts with the kernel-state driver through ioctl() and mmap(). The kernel-state drive is mainly implemented by vg_lite_hal.c, vg_lite functions in .c perform actual register operations by calling functions in vg_lite_hal.c through vg_lite_kernel functions.
Note: The driver does not check the physical address in the drawing instruction, and calling the VGLite API can indirectly read and write all physical addresses of DDR memory.
3. Software Interface#
The software interface is detailed in the header file vg_lite.h in the package.
3.1 Main Types and Definitions#
3.1.1 Parameter types#
name |
Type definition |
meaning |
---|---|---|
Int32_t |
Int |
32-bit signed integer |
uint32_t |
Unsigned int |
32-bit unsigned integer |
VG_LITE_S8 |
enum vg_lite_format_t |
8-bit signed integer coordinates |
VG_LITE_S16 |
enum vg_lite_format_t |
16-bit signed integer coordinates |
VG_LITE_S32 |
enum vg_lite_format_t |
32-bit signed integer coordinates |
vg_lite_float_t |
float |
Single-precision floating-point number |
vg_lite_color_t |
uint32_t |
32-bit color values Color values specify the colors used in various functions. The color is formed with an 8-bit RGBA channel. The red channel is in the bottom 8 bits of the color value, followed by the green and blue channels. The alpha channel is in the top 8 bits of the color value. For the L8 target format, RGB colors are converted to L8 by using the default ITU-R BT.709 conversion rules. |
3.1.2 Error Type vg_lite_error_t#
enumerate |
description |
---|---|
VG_LITE_GENERIC_IO |
Unable to communicate with kernel driver |
VG_LITE_INVALID_ARGUMENT |
Illegal parameters |
VG_LITE_MULTI_THREAD_FAIL |
Multithreaded error |
VG_LITE_NO_CONTEXT |
No context error |
VG_LITE_NOT_SUPPORT |
Feature not supported |
VG_LITE_OUT_OF_MEMORY |
No allotable drive heap memory |
VG_LITE_OUT_OF_RESOURCES |
No system heap memory to allocate |
VG_LITE_SUCCESS |
succeed |
VG_LITE_TIMEOUT |
Timeout |
VG_LITE_ALREADY_EXISTS |
Object already exists |
VG_LITE_NOT_ALIGNED |
Data alignment error |
3.1.3 Feature Enumeration#
enumerate |
description |
---|---|
gcFEATURE_BIT_VG_BORDER_CULLING |
Border clipping |
gcFEATURE_BIT_VG_GLOBAL_ALPHA |
Global Alpha |
gcFEATURE_BIT_VG_IM_FASTCLEAR |
Fast purge |
gcFEATURE_BIT_VG_IM_INDEX_FORMAT |
Color index |
gcFEATURE_BIT_VG_PE_PREMULTIPLY |
Alpha channel premultiplication |
gcFEATURE_BIT_VG_RADIAL_GRADIENT |
Radial grayscale |
gcFEATURE_BIT_VG_RGBA2_FORMAT |
RGBA2222 format |
3.2 GPU control#
If not specifically stated, then before calling any API function, the application must initialize the GPU implicit (global) context by calling the vg_lite_init, which will populate the feature table, reset the fast purge buffer, reset the synthetic target buffer, and allocate command and subdivision buffers.
The GPU driver only supports one current context and one thread to issue commands to the GPU. GPU drivers do not support multiple concurrent contexts running simultaneously in multiple threads/processes because GPU kernel drivers do not support context switching. A GPU application can only use one context to issue commands to the GPU hardware at any one time. If a GPU application needs to switch contexts, it should callvg_lite_close to close the current context in the current thread, and then the vg_lite_init can be calledvg_lite_initto initialize a new context in the current thread or in another thread/process.
If there is no special description, all functions returnvg_lite_error_t.
3.2.1 Context initialization and control functions#
3.2.1.1 vg_lite_set_command_buffer_size#
description
This function is optional. If you need to modify the size of the command buffer, you must call it before vg_lite_init. The command buffer size defaults to 64KB, which does not mean that a frame of commands must be less than 64KB, when the buffer is full, the render will be submitted directly to empty the buffer, and a larger buffer means a lower frequency of render commits, which can reduce the overhead of system calls.
parameter
Parameter name |
description |
Input/output |
---|---|---|
size |
Command buffer length |
input |
3.2.1.2 vg_lite_init#
description
This function initializes the memory and data structures required for the GPU plot/fill function, allocating memory for command buffers and patch buffers of the specified size. The width and height of the insert buffer must be multiples of 16. Sliced windows can be specified based on the amount of memory available in the system and the required performance. A smaller window can have a smaller memory footprint, but may result in lower performance. The smallest window available for tessellation is 16x16. If the height or width is less than 0, no insert buffer is created and can be used in cases where there is only padding.
If this will be the first context to access the hardware, the hardware will be opened and initialized. If you need to initialize a new context, you must call vg_lite_close to close the current context. Otherwise, the vg_lite_init returns an error.
parameter
Parameter name |
description |
Input/output |
---|---|---|
tessellation_width |
Slice window width. Must be an integer multiple of 16, the minimum is 16, the maximum cannot be greater than the frame width, if 0 means that tessellation is not used, then the GPU will only run blit. |
input |
tessellation_height |
Insert window height. Must be an integer multiple of 16, the minimum is 16, the maximum cannot be greater than the frame width, if 0 means that tessellation is not used, then the GPU will only run blit. |
input |
3.2.1.3 vg_lite_close#
description
Deletes all resources and frees all memory previously initialized by vg_lite_init functions. If this is the only active context, it also automatically shuts down the hardware.
3.2.1.4 vg_lite_finish#
description
This function explicitly submits the command buffer to the GPU and waits for it to complete.
3.2.1.5 vg_lite_flush#
description
This function explicitly commits the command buffer to the GPU without waiting for it to complete.
3.2.2 Pixel buffers#
3.2.2.1 Memory alignment requirements#
3.2.2.1.1 Source image alignment requirements#
GPU hardware requires rasterized images to be a multiple of 16 pixels. This requirement applies to all image formats. Therefore, users need to pad arbitrary image widths into multiples of 16 pixels for the GPU hardware to work correctly.
The byte alignment requirements for pixels depend on the specific pixel format.
Image format |
Bits per pixel |
Alignment requirements |
Supported as a source image |
Supported as a target image |
---|---|---|---|---|
VG_LITE_INDEX1 |
1 |
8B |
be |
not |
VG_LITE_INDEX2 |
2 |
8B |
be |
not |
VG_LITE_INDEX4 |
4 |
8B |
be |
not |
VG_LITE_INDEX8 |
8 |
16B |
be |
not |
VG_LITE_A4 |
4 |
8B |
be |
not |
VG_LITE_A8 |
8 |
16B |
be |
be |
VG_LITE_L8 |
8 |
16B |
be |
be |
VG_LITE_ARGB2222 group |
8 |
16B |
be |
be |
VG_LITE_RGB565 group |
16 |
32B |
be |
be |
VG_LITE_ARGB1555 group |
16 |
32B |
be |
be |
VG_LITE_ARGB4444 group |
16 |
32B |
be |
be |
VG_LITE_ARGB8888/XRGB8888 group |
32 |
64B |
be |
be |
3.2.2.1.2 Render target buffer alignment requirements#
GPU hardware requires pixel buffers to be in multiples of 16 pixels. This requirement applies to all image formats. Therefore, the user needs to align arbitrary pixel buffer widths to multiples of 16 pixels for the GPU hardware to work correctly. The byte alignment requirements for pixels depend on the specific pixel format.
See Table 2: Image Source Alignment Summary later in this document.
The start address alignment requirements for pixel buffers depend on whether the buffer layout format is tiled or linear (vg_lite_buffer_layout_t enum).
- If the format is tiled (4x4 tile), the start address and stride need to be 64 bytes aligned.
If the format is linear, there is no alignment requirement for the start address and stride.
3.2.2.2 Pixel caching#
The GPU includes two fully associative caches. Each cache has 8 rows and 64 bytes each. In this case, a cache line can hold a 4x4 pixel tile or a 16x1 pixel row.
3.2.3 Enumeration Types#
3.2.3.1 vg_lite_buffer_format_t#
This enumeration type specifies the color format to use for the buffer.
Note: For a summary of the alignment requirements of image formats, see Memory alignment requirements after numeric descriptions.
vg_lite_buffer_format_t |
description |
Supported as a source |
Support as a target |
Alignment (bytes) |
---|---|---|---|---|
VG_LITE_ABGR8888 |
8bits per channel, alpha in lower 8bits |
be |
be |
64 |
VG_LITE_ARGB8888 |
be |
be |
64 |
|
VG_LITE_BGRA8888 |
be |
be |
64 |
|
VG_LITE_RGBA8888 |
be |
be |
64 |
|
VG_LITE_BGRX8888 |
be |
be |
64 |
|
VG_LITE_RGBX8888 |
be |
be |
64 |
|
VG_LITE_XBGR8888 |
be |
be |
64 |
|
VG_LITE_XRGB8888 |
be |
be |
64 |
|
VG_LITE_ABGR1555 |
be |
be |
32 |
|
VG_LITE_ARGB1555 |
be |
be |
32 |
|
VG_LITE_BGRA5551 |
be |
be |
32 |
|
VG_LITE_RGBA5551 |
be |
be |
32 |
|
VG_LITE_BGR565 |
be |
be |
32 |
|
VG_LITE_RGB565 |
be |
be |
32 |
|
VG_LITE_ABGR4444 |
be |
be |
32 |
|
VG_LITE_ARGB4444 |
be |
be |
32 |
|
VG_LITE_BGRA4444 |
be |
be |
32 |
|
VG_LITE_RGBA4444 |
be |
be |
32 |
|
VG_LITE_A4 |
4bits alpha, 无RGB |
be |
not |
8 |
VG_LITE_A8 |
8bits alpha, 无RGB |
be |
be |
16 |
VG_LITE_ABGR2222 |
be |
be |
16 |
|
VG_LITE_ARGB2222 |
be |
be |
16 |
|
VG_LITE_BGRA2222 |
be |
be |
16 |
|
VG_LITE_RGBA2222 |
be |
be |
16 |
|
VG_LITE_INDEX_1 |
1-bit index format |
be |
not |
8 |
VG_LITE_INDEX_2 |
2-bit index format |
be |
not |
8 |
VG_LITE_INDEX_4 |
4-bit index format |
be |
not |
8 |
VG_LITE_INDEX_8 |
8-bit index format |
be |
not |
8 |
3.2.3.2 vg_lite_buffer_image_mode_t#
Specifies how the image is rendered to the buffer.
enumerate |
description |
---|---|
VG_LITE_NORMAL_IMAGE_MODE |
An image drawn in blend mode |
VG_LITE_NONE_IMAGE_MODE |
Image input is ignored |
VG_LITE_MULTIPLY_IMAGE_MODE |
The image is multiplied by the drawing color |
3.2.3.3 vg_lite_buffer_layout_t#
Specifies the layout of buffer data in memory.
enumerate |
description |
---|---|
VG_LITE_LINEAR |
Linear (scanline) layout. Note: This layout has no alignment requirements for buffers. |
VG_LITE_TILED |
The data is organized into 4x4 pixel inserts. Note: For this layout, the start address and span of the buffer need to be 64 bytes aligned. |
3.2.3.4 vg_lite_buffer_transparency_mode_t#
Specifies the transparency mode of a buffer.
enumerate |
description |
---|---|
VG_LITE_IMAGE_OPAQUE |
Opaque images: All image pixels are copied to VG PE for rasterization. |
VG_LITE_IMAGE_TRANSPARENT |
Transparent images: Only non-transparent image pixels are copied into VG PE. Note: This mode only works if the image mode is VG_LITE_NORMAL_IMAGE_MODE or VG_LITE_MULTIPLY_IMAGE_MODE. |
3.2.4 Structures#
3.2.4.1 vg_lite_buffer_t#
This structure defines the buffer layout of the image or memory data used by the GPU.
field |
type |
description |
---|---|---|
width |
int32_t |
Buffer pixel width |
height |
int32_t |
Buffer pixel height |
stride |
int32_t |
The number of bytes in a row |
tiled |
vg_lite_buffer_layout_t |
Linear or tiled |
format |
vg_lite_buffer_format_t |
Color type |
handle |
void * |
Memory handle |
memory |
void * |
The mapped virtual address |
address |
uint32_t |
Physical address |
.yuv |
N/A |
N/A |
image_mode |
vg_lite_buffer_image_mode_t |
BLIT mode |
transparency_mode |
vg_lite_buffer_transparency_mode_t |
Transparent mode |
3.2.5 Functions#
3.2.5.1 vg_lite_allocate#
description
This function is used to allocate memory before using the buffer in the blit or draw function.
In order for the hardware to access some memory, such as the source image or destination buffer, it needs to be allocated first. The providedvg_lite_buffer_tstructure needs to be initialized with the size (width and height) and format of the requested buffer. If stride is set to 0, this function will fill it in. The only input parameter to this function is a pointer to the buffer structure. If the structure has all the required information, appropriate memory is allocated for the buffer.
This function will call the kernel to actually allocate memory. Memoryhandles, logical addresses, and hardware addresses in vg_lite_buffer_t structures are populated by the kernel.
parameter
vg_lite_buffer_t *buffer: A pointer to the buffer that holds the size and format of the allocated buffer.
3.2.5.2 vg_lite_free#
description
This function is used to cancel a previously allocated buffer and free the memory of the buffer.
parameter
vg_lite_buffer_t *buffer: A pointer to the structure of the buffer being filled by the vg_lite_allocate.
3.2.5.3 vg_lite_buffer_upload#
description
This function uploads pixel data to GPU memory. Note that the format of the data (pixels) to be uploaded must be the same as described in the buffer object. The input data memory buffer should contain enough data to upload to the GPU buffer pointed to by the input parameter “buffer”.
parameter
vg_lite_buffer_t *buffer: A pointer to the structure of the buffer being filled by the vg_lite_allocate.
uint8_t *data[3]: A pointer to pixel data.
uint32_t stride[3]: The row span of pixel data.
3.2.5.4 vg_lite_map#
description
This function is used to map memory appropriately for a particular buffer. It will be used to properly translate the physical address of the buffer required by the GPU.
If you want to use a frame buffer directly as the destination buffer, you need to wrap it with avg_lite_buffer_tstructure and call the kernel to map the provided logical or physical address into hardware-accessible memory. For example, if you know the logical address of the frame buffer, setthe memory field of the vg_lite_buffer_t structure with this addressand call this function. If you know the physical address, set the memory field to NULL, and then program the address field with the physical address.
parameter
vg_lite_buffer_t *buffer: A pointer to the structure of the buffer being filled by the vg_lite_allocate.
3.2.5.5 vg_lite_unmap#
description
This function unmaps the buffer and frees any memory resources allocated by the previous call vg_lite_map.
parameter
vg_lite_buffer_t *buffer: A pointer to the structure of the buffer being filled by the vg_lite_map.
3.2.5.6 vg_lite_set_CLUT#
description
This function sets the color lookup table (CLUT) for indexed color images in context. Once CLUT is set (not empty), the image pixel color used for index format image rendering will be obtained from the color lookup table (CLUT) based on the color index value of the pixel.
parameter
uint32_t count: The count of colors in the color lookup table.
For INDEX_1, there can be up to 2 colors in the table.
For INDEX_2, there can be up to 4 colors in the table.
For INDEX_4, there can be up to 16 colors in the table.
For INDEX_8, there can be up to 256 colors in the table.
uint32_t *colors: The color lookup table (CLUT) pointed to by the pointer will be stored in context and programmed into the command buffer if needed. CLUT does not take effect until the command buffer is committed to hardware. The color is in ARGB format, with A in the highest position.
Note: The driver does not validate CLUT content from the application.
3.3 Matrix#
3.3.1 Structs#
3.3.1.1 vg_lite_matrix_t#
description
A 3x3 floating-point number matrix is defined.
3.3.2 Functions#
3.3.2.1 vg_lite_identity#
description
Set the matrix as the identity matrix.
parameter
vg_lite_matrix_t *matrix: A pointer to the vg_lite_matrix_tstructure that will be loaded into the identity matrix.
3.3.2.2 vg_lite_perspective#
description
Perform a perspective transformation on the matrix.
parameter
vg_lite_float_t px: Perspective transformation matrix.
vg_lite_float_t py: of the perspective transformation matrix
vg_lite_matrix_t *matrix: A pointer to the vg_lite_matrix_t structure that will be transformed by perspective3.3.1.1 vg_lite_matrix_t.
3.3.2.3 vg_lite_rotate#
description
Performs a rotational transformation of the matrix at a specified angle.
parameter
vg_lite_float_t degrees: The number of degrees of rotation of the matrix (in the system of angles), with positive numbers representing clockwise rotation.
vg_lite_matrix_t *matrix: A pointer to the vg_lite_matrix_t structure that will be rotated3.3.1.1 vg_lite_matrix_t.
3.3.2.4 vg_lite_scale#
description
Scale the matrix vertically and horizontally.
parameter
vg_lite_float_t scale_x: Horizontal scaling factor.
vg_lite_float_t scale_y: Vertical scaling factor.
vg_lite_matrix_t *matrix: A pointer to the vg_lite_matrix_t structure that will be scaled3.3.1.1 vg_lite_matrix_t.
3.3.2.5 vg_lite_translate#
description
Transform the matrix by translation.
parameter
vg_lite_float_t x: Horizontal translation.
vg_lite_float_t y: Vertical translation.
vg_lite_matrix_t *matrix: A pointer to the vg_lite_matrix_t structure that will be translated3.3.1.1 vg_lite_matrix_t.
3.4 BLITs for synthesis and mixing#
3.4.1 Enumeration Types#
3.4.1.1 vg_lite_blend_t#
This enumeration defines some of the mixed modes supported by VGLite API functions. S and D represent the source and destination color channels, and Sa and Da represent the source and destination alpha channels.
Colors are displayed with 100% and 50% opacity.
vg_lite_blend_t |
description |
---|---|
VG_LITE_BLEND_ADDITIVE |
S+D |
VG_LITE_BLEND_DST_IN |
Sa*D |
VG_LITE_BLEND_DST_OVER |
(1-Da)*S+D |
VG_LITE_BLEND_MULTIPLY |
S*(1-Da)+D*(1-Sa)+S*D |
VG_LITE_BLEND_NONE |
S |
VG_LITE_BLEND_SCREEN |
S+D-S*D |
VG_LITE_BLEND_SRC_IN |
Da*S |
VG_LITE_BLEND_SRC_OVER |
S+(1-Sa)*D |
VG_LITE_BLEND_SUBTRACT |
D*(1-Sa) |
3.4.1.2 vg_lite_filter_t#
Specify the sample filtering mode in blit and draw APIs.
vg_lite_filter_t |
description |
---|---|
VG_LITE_FILTER_POINT |
Only the most recent image pixels are taken. |
VG_LITE_FILTER_LINEAR |
Use linear interpolation along horizontal lines. |
VG_LITE_FILTER_BI_LINEAR |
Use a 2x2 box around the image pixels and interpolate. |
3.4.1.3 vg_lite_global_alpha_t#
Specifies the global alpha mode in blit APIs.
vg_lite_global_alpha_t |
description |
---|---|
VG_LITE_NORMAL |
Use the original src/dst alpha value. |
VG_LITE_GLOBAL |
Replace the original src/dst alpha value with the global src/dst alpha value. |
VG_LITE_SCALED |
Multiply the global src/dst alpha value with the original src/dst alpha value. |
3.4.2 Structures#
3.4.2.1 vg_lite_rectangle_t#
The structure defines the organization of a rectangle of data.
field |
type |
description |
---|---|---|
x |
int32_t |
x coordinate, with the upper left corner as the origin |
and |
int32_t |
Y coordinates, upper left corner is the origin |
width |
int32_t |
width |
height |
int32_t |
height |
3.4.2.2 vg_lite_point_t#
This structure defines a 2D point.
field |
type |
description |
---|---|---|
x |
int32_t |
X coordinate |
and |
int32_t |
Y coordinates |
3.4.2.3 vg_lite_point4_t#
This structure defines four two-dimensional points that make up the polygon. These points are defined by structural vg_lite_point_t.
field |
type |
description |
---|---|---|
vg_lite_point[4] |
int32_t each |
Array of 4 points |
3.4.3 Functions#
3.4.3.1 vg_lite_blit#
description
BLIT functions. BLIT operations are performed through a source buffer and a destination buffer. The source and destination buffer structures are defined with vg_lite_buffer_t structures. BLIT copies the source image to the destination buffer with a specified matrix, which can include translation, rotation, scaling, and perspective correction. Note that overlay sampling antialiasing is not supported vg_lite_buffer_t, so the edges of the destination buffer may not be smooth, especially the rotation matrix. Path rendering can be used to achieve high-quality overlay sample anti-aliasing (16X, 4X) rendering.
parameter
vg_lite_buffer_t *target: Points to the vg_lite_buffer_t struct that defines the destination buffer. For more information about the valid target color formats for the BLIT function, see Source Image Alignment Requirements.
vg_lite_buffer_t *source: A vg_lite_buffer_t struct that points to the source buffer. All color formats in vg_lite_buffer_format_tenumeration are valid source formats for the blit function.
vg_lite_matrix_t *matrix: Points to a vg_lite_matrix_t structure that defines a 3x3 transformation matrix from source pixels to destination pixels. If the matrix is NULL, an identity matrix is assumed, which means that the source pixel will be copied directly to the (0,0) position of the destination pixel.
vg_lite_blend_t blend: Specifies one of the hardware-supported blending modes to apply to each image pixel. If blending is not required, set this value to VG_LITE_BLEND_NONE. Note: If the “matrix” parameter is specified as rotation or perspective, and the “blend” parameter is specified as VG_LITE_BLEND_NONE, VG_LITE_BLEND_SRC_IN, or VG_LITE_BLEND_DST_IN, the driver overrides the application’s settings for BLIT operations and the transparency mode is always set to TRANSPARENT. This is due to some limitations of GPU hardware.
vg_lite_color_t color: If not zero, this color value is used as the blend color. Before blending occurs, blending colors are multiplied by each source pixel. If you don’t need to blend colors, set the color parameter to 0.
vg_lite_filter_t filter: Specifies the type of filtering mode. All formats in the vg_lite_filter_t enumeration are valid formats for this function.
3.4.3.2 vg_lite_blit_rect#
description
BLIT rectangle function.
parameter
vg_lite_buffer_t *target: 参考vg_lite_blit。
vg_lite_buffer_t *source: 参考vg_lite_blit。
uint32_t *rect: Specifies the rectangular area to BLIT.
vg_lite_matrix_t *matrix: Reference vg_lite_blit.
vg_lite_blend_t *blend: Reference vg_lite_blit.
vg_lite_color_t color: Reference vg_lite_blit.
vg_lite_filter_t filter: 参考vg_lite_blit。
3.4.3.3 vg_lite_get_transform_matrix#
description
This function obtains a 3x3 homogeneous transformation matrix from the source and target coordinates.
parameter
vg_lite_point4_t src: A pointer to a set of four two-dimensional points that make up the source polygon.
vg_lite_point4_t dst: A pointer to a set of four two-dimensional points that make up the target polygon.
vg_lite_matrix_t *mat: Output parameter pointing to a 3x3 homogeneous matrix that converts the source polygon to the target polygon.
3.4.3.4 vg_lite_clear#
description
This function performs a clear operation, clearing/filling the specified pixel buffer (the entire buffer or part of the rectangle in the buffer) with an explicit color.
parameter
vg_lite_buffer_t *target: 参考vg_lite_blit。
vg_lite_rectangle_t *rectangle: A pointer to the vg_lite_rectangle_t structure that specifies the area to fill. If the rectangle is NULL, the entire destination buffer is filled with the specified color.
vg_lite_color_t color: The color of the fill, as specified in the vg_lite_color_t enumeration, which is the color value used to fill the buffer. If the buffer is in L8 format, the RGBA color will be converted to a luminance value.
3.4.4 Global Alpha Function#
3.4.4.1 vg_lite_set_image_global_alpha#
description
This function will set the image/source global alpha and return a status error code.
parameter
vg_lite_global_alpha_t alpha_mode: 全局Alpha模式,参见vg_lite_global_alpha_t.
uint32_t alpha_value: The Image/Source global alpha value to set.
3.4.4.2 vg_lite_set_dest_global_alpha#
description
This function will set the target global alpha and return a status error code.
parameter
vg_lite_global_alpha_t alpha_mode: 全局Alpha模式,参见vg_lite_global_alpha_t.
uint32_t alpha_value: The target global alpha value to set.
3.5 Vector Path Control#
3.5.1 Enumeration Types#
3.5.1.1 vg_lite_quality_t#
Specifies the level of hardware-assisted antialiasing.
vg_lite_quality_t |
description |
---|---|
VG_LITE_HIGH |
High quality: 16x coverage sampling anti-aliasing |
VG_LITE_UPPER |
High quality: 8x coverage sampling anti-aliasing (removed from June 2020). |
VG_LITE_MEDIUM |
Medium quality: 4x coverage sampling anti-aliasing |
VG_LITE_LOW |
Low quality: No anti-aliasing |
3.5.2 Structs#
3.5.2.1 vg_lite_hw_memory#
This structure simply records the kernel’s memory allocation information.
field |
type |
description |
---|---|---|
handle |
void * |
GPU memory object handle |
memory |
void * |
Logical address |
address |
uint32_t |
GPU memory address |
bytes |
uint32_t |
size |
property |
uint32_t |
The 0th bit is used to indicate the path upload. 0: Disables path data upload (always embedded in command buffer). 1: Enable automatic path data upload. |
3.5.2.2 vg_lite_path_t#
The structure describes the vector path data.
Path data consists of opcodes and coordinates. The format of opcodes is always VG_LITE_S8. For details on opcodes, refer to the section on vector path opcodes in this document.
field |
type |
description |
---|---|---|
bounding_box[4] |
vg_lite_float_t |
The bounding box of the path. [0] Left [1] Top [2] Right [3] Bottom |
quality |
vg_lite_quality_t |
Enumeration type of path quality, anti-aliasing level |
format |
enum vg_lite_format_t |
The enumeration type of the coordinate format |
uploaded |
vg_lite_hw_memory_t |
A structure with path data that has been uploaded to GPU-addressable memory |
path_length |
int32_t |
The number of bytes of the path |
path |
void * |
Path data pointer |
path_changed |
int32_t |
0: unchanged; 1: Change. |
About the alignment of coordinate formats with data.
vg_lite_format_t |
Path data alignment |
---|---|
VG_LITE_S8 |
8 bit |
VG_LITE_S16 |
2 bytes |
VG_LITE_S32 |
4 bytes |
3.5.2.2.1 Special description of path objects#
The end order has no effect because it is boundary-aligned.
Multiple consecutive opcodes should be packaged at the size of the specified data format. For example, for VG_LITE_S16 packed in 2 bytes, for VG_LITE_S32 packed in 4 bytes. Since opcodes are 8 bits (1 byte), for 16-bit (2-byte) or 32-bit (4-byte) data types:
…
<opcode1_that_needs_data >, the opcode is 8 bits.
<align_to_data_size> (alignment data).
< data for opcode1><align_to_data_size><data_for_opcode1>
<opcode2_that_doesnt_need_data> <opcode2_that_doesnt_need_data>
<opcode3_that_needs_data></p><p><p>
<align_to_data_size
<data_for_opcode3> <data_for_opcode3>
…
The path data in the array should always be 1, 2, or 4 bytes aligned, depending on the format. For example, for a 32-bit (4-byte) data type.
…
<opcode1_that_needs_data ><opcode1_that_needs_data>?
<pad to 4 bytes
<4 byte data_for_opcode1> (4 bytes).
<opcode2_that_doesnt_need_data> <opcode2_that_doesnt_need_data>
<opcode3_that_needs_data>
<pad to 4 bytes
<4 bytes data_for_opcode3>
…
For float types, using the IEEE754 encoding specification, the opcode is still an 8-bit signed integer and may require special handling by the software.
3.5.3 Functions#
3.5.3.1 vg_lite_path_calc_length#
description
This function calculates the buffer length (in bytes) of the path command. The application can allocate a buffer to use as a command buffer based on the buffer length calculated by this function.
parameter
uint8_t *cmd: A pointer to an array of opcodes used to build the path.
uint32_t count: The number of opcodes.
vg_lite_format_t format: Coordinate data format, all formats available for vg_lite_format_t are valid formats for this function.
3.5.3.2 vg_lite_path_append#
description
This function assembles the command buffer for the path, and this function makes the final GPU command for the path based on the input opcode (cmd) and coordinates (data).
parameter
vg_lite_path_t *path: A pointer to the vg_lite_path_t struct with the path definition.
uint8_t *cmd: A pointer to an array of opcodes used to build the path.
void *data: A pointer to the coordinate data array used to build the path.
uint32_t seg_count: Number of opcodes.
3.5.3.3 vg_lite_init_path#
description
This function initializes a path definition with the specified value.
parameter
vg_lite_path_t *path: A pointer to the vg_lite_path_t structure of the path object to initialize with the specified member value.
vg_lite_format_t data_format: Coordinate data format. All formats in the vg_lite_format_t enumeration are valid formats for the function.
vg_lite_quality_t quality: The quality of the path object. All formats in the vg_lite_quality_t enumeration are valid formats for this function.
uint32_t path_length: The length of the path data in bytes.
void *path_data: A pointer to path data.
vg_lite_float_t min_x, vg_lite_float_t min_y, vg_lite_float_t max_x, vg_lite_float_t max_y: Minimum and maximum x and y values that specify the bounding box of the path.
3.5.3.4 vg_lite_init_arc_path#
description
This function initializes an arc path definition with the specified value.
parameter
vg_lite_path_t *path: Reference vg_lite_init_path.
vg_lite_format_t data_format: vg_lite_init_path reference.
vg_lite_quality_t quality: 参考vg_lite_init_path。
uint32_t path_length: vg_lite_init_path reference.
void *path_data: Reference vg_lite_init_path.
vg_lite_float_t min_x, vg_lite_float_t min_y, vg_lite_float_t max_x, vg_lite_float_t max_y: vg_lite_init_path reference.
3.5.3.5 vg_lite_upload_path#
description
This function is used to upload the path to GPU memory.
Under normal circumstances, the GPU driver copies any path data into a command buffer structure during runtime. If there are a lot of paths to render, this does take some time. In addition, in embedded systems, path data usually does not change, so it makes sense to upload path data to GPU memory in a form that can be directly accessed by the GPU. This function will prompt the driver to allocate a buffer that will contain the path data and the data for the head and foot of the required command buffer for the GPU to access this data directly. When the path is finished, call vg_lite_clear_path to free the buffer.
parameter
vg_lite_path_t *path: A pointer to vg_lite_path_t struct that contains the path to upload.
3.5.3.6 vg_lite_clear_path#
description
This function will clear and reset the path member values. If the path has already been uploaded, it will free the GPU memory allocated when the path was uploaded.
parameter
vg_lite_path_t *path: A pointer to the vg_lite_path_t path definition to clear.
3.5.4 Vector Path Opcodes#
The following opcodes are path drawing commands that can be used for vector path data.
A path operation is submitted to the GPU in the form of [opcode|coordinates]. Operation codes are stored in the form of VG_LITE_S8, while coordinates are specified by vg_lite_format_t.
Opcode |
parameter |
description |
---|---|---|
0x00 |
None |
END. Finish, closing all open paths. |
0x02 |
(x,y) |
MOVE. Move to a given vertex. Close all open paths. = = |
0x03 |
(Δx,Δy) |
MOVE_REL. Move to a given relative point. Close all open paths. =+Δ =+Δ |
0x04 |
(x,y) |
LINE. Draw a straight line to a given vertex. (,,,) = = |
0x05 |
(Δx,Δy) |
LINE_REL. Draw a straight line to the vertices at opposite positions. =+Δ =+Δ (,,,) = = |
0x06 |
(cx,cy) (x,y) |
QUAD. Draws a quadric to a given endpoint using the specified control points. (,,,,,) = = |
0x07 |
(Δcx,Δcy) (Δx,Δy) |
QUAD_REL. Draws a quadric to the endpoint of a given relative position using the specified relative control point. = +Δ =+Δ = +Δ =+Δ (,,,,,) = = |
0x08 |
(cx1,cy1) (cx2,cy2) (x,y) |
CUBIC. Draws a cubic curve to a given endpoint using the specified control points. (,,1,1,2,2,,) = = |
0x09 |
(Δcx1,Δcy1) (Δcx2,Δcy2) (Δx,Δy) |
CUBIC_REL. Draws a cubic curve to the endpoint of a given relative position using the specified control point. 1= +Δ1 1=+Δ1 2=+Δ2 2=+Δ2 =+Δ =+Δ (,,1,1,2,2,,) = = |
0x0A |
(rh,rv,rot,x,y) |
SCCWARC. Draw a small CCW arc to the given endpoint using the specified radius and rotation angle. (,,,,) = = |
0x0B |
(rh,rv,rot,x,y) |
SCCWARC_REL. Draw a small CCW arc towards the given opposite endpoints using the specified radius and rotation angle. =+Δ =+Δ (ℎ,,,,) = = |
0x0C |
(rh,rv,rot,x,y) |
SCWARC. Draw a small CW arc to the given endpoint using the specified radius and rotation angle. (ℎ,,,,) = = |
0x0D |
(rh,rv,rot,x,y) |
SCWARC_REL. Draw a small CW arc to a given opposite endpoint using the specified radius and rotation angle. =+Δ =+Δ (ℎ,,,,) = = |
0x0E |
(rh,rv,rot,x,y) |
LCCWARC. Draw a large CCW arc towards a given endpoint using the specified radius and rotation angle. (ℎ,,,,) = = |
0x0F |
(rh,rv,rot,x,y) |
LCCWARC_REL. Draw a large CCW arc towards a given opposite endpoint using the specified radius and rotation angle. =+Δ =+Δ (ℎ,,,,) = = |
0x10 |
(rh,rv,rot,x,y) |
LCWARC. Draw a large CW arc to the given endpoint using the specified radius and rotation angle. (ℎ,,,,) = = |
0x11 |
(rh,rv,rot,x,y) |
LCWARC_REL. Draw a large CW arc to a given relative endpoint using the specified radius and rotation angle. =+Δ =+Δ (ℎ,,,,) = = |
3.6 Vector-based drawing operations#
3.6.1 Enumeration Types#
3.6.1.1 vg_lite_fill_t#
This enumeration is used to specify the padding rule to use. For drawing any path, the hardware supports non-zero and parity fill rules.
To determine whether any point is contained within an object, imagine drawing a line from that point in any direction to infinity so that the line does not cross any vertices of the path. For each edge crossed by the line, add 1 to the counter if the edge crosses from left to right, as seen by an observer walking from the line to infinity; subtract 1 if the edge crosses from right to left. This way, each area of the plane gets an integer value.
The non-zero fill rule says that if the resulting sum is not equal to zero, then a point is within the shape. The even/odd rule says that if the sum of the results is odd, then a point is inside the shape, regardless of the sign.
vg_lite_fill_t |
description |
---|---|
VG_LITE_FILL_NON_ZERO |
Non-zero fill rules. If a pixel intersects at least one path pixel, it is drawn. |
VG_LITE_FILL_EVEN_ODD |
Even fill rules. If a pixel intersects an odd number of path pixels, it is drawn. |
3.6.1.2 vg_lite_pattern_mode_t#
Defines how areas outside the image pattern are filled onto the path.
vg_lite_pattern_mode_t |
description |
---|---|
VG_LITE_PATTERN_COLOR |
Fills the outside of the pattern by color. |
VG_LITE_PATTERN_PAD |
The color of the pattern border is expanded to fill the area outside the pattern. |
3.6.2 Structures#
v3.6.2.1 g_lite_color_ramp_t#
This structure defines the end point of the radial gradient. Five parameters provide the offset and color of the stop. Each stop is defined by a set of floating-point values that specify the offset as well as the sRGBA color and alpha values. The values of the color channel exist as non-multiplying (R, G, B, ALPHA) quadrilaterals. All parameters are in the range of [0,1]. Red, green, blue, α values [0,1] are mapped to 8-bit pixel values [0,255].
The maximum number of defined radial gradient stops is MAX_COLOR_RAMP_STOPS, which is 256.
field |
type |
description |
---|---|---|
stop |
vg_lite_float_t |
The offset of the color stop |
red |
vg_lite_float_t |
The offset of the red stop |
green |
vg_lite_float_t |
The offset of the green stop |
blue |
vg_lite_float_t |
The offset of the blue stop |
alpha |
vg_lite_float_t |
The offset of the alpha channel stop |
3.6.2.2 vg_lite_linear_gradient_t#
This structure defines the organization of linear gradients in VGLite data. A linear gradient is applied to fill the path. It will generate a 256x1 image depending on the settings.
field |
type |
description |
---|---|---|
colors[VLC_MAX_GRAD] |
uint32_t |
An array of colors for the gradient |
count |
uint32_t |
Number of colors |
stops[VLC_MAX_GRAD] |
uint32_t |
Number of color levels, from 0 to 255 |
matrix |
vg_lite_matrix_t |
A matrix structure that converts the gradient color slope |
image |
vg_lite_buffer_t |
An image object structure that represents the color slope. |
The VLC_MAX_GRAD is the largest of 16, VLC_GRADBUFFER_WIDTH the maximum is 256.
3.6.3 Functions#
3.6.3.1 vg_lite_draw#
description
Perform hardware-accelerated 2D vector drawing operations.
The size of the insert buffer can be specified at initialization time, and the size will be adjusted by the kernel to the minimum alignment required by the hardware. If you make the insert buffer smaller, less memory will be allocated, but a path may be delivered to the hardware multiple times, because the hardware will walk the target with the provided insert window size, so performance may be degraded. It is good practice to set the size of the insert buffer to the most common path size. For example, if all you do is render fonts up to 24pt, you can set the insert buffer to 24x24.
parameter
vg_lite_buffer_t *target: A pointer to the vg_lite_buffer_t struct of the destination buffer. All color formats in vg_lite_buffer_format_t enumerations are valid target formats for drawing functions.
vg_lite_path_t *path: A pointer to vg_lite_path_t structure that contains data that describes the path to draw. For details on opcodes, refer to the section on vector path opcodes in this file.
vg_lite_fill_t fill_rule: Specifies the enumeration value of the vg_lite_fill_t for the path’s fill rule.
vg_lite_matrix_t *matrix: A pointer to vg_lite_matrix_t structure that defines the affine transformation matrix of the path. If the matrix is NULL, an identity matrix is assumed. Note: Nonaffine transformations are not supported vg_lite_draw, so the perspective transformation matrix has no effect on the path.
vg_lite_blend_t Blend: Selects a hardware-supported blend mode in the vg_lite_blend_t enumeration to apply to each drawn pixel. If blending is not required, set this value to VG_LITE_BLEND_NONE, which is 0.
vg_lite_color_t color: The color applied to each pixel drawn by the path.
3.6.3.2 vg_lite_draw_gradient#
description
This function is used to fill the path with a gradient color according to the specified fill rule. The specified path is transformed according to the selected matrix and filled with a gradient.
parameter
vg_lite_buffer_t *target: 参照vg_lite_draw.
vg_lite_path_t *path: See vg_lite_draw.
vg_lite_fill_t fill_rule: See vg_lite_draw.
vg_lite_matrix_t *matrix: See vg_lite_draw.
vg_lite_linear_gradient_t *grad: A pointer to vg_lite_linear_gradient_t struct that contains values to fill the path.
vg_lite_blend_t blend: See vg_lite_draw.
3.6.3.3 vg_lite_draw_pattern#
description
This function fills a path with an image pattern. The path is transformed according to the specified matrix and filled with the converted image pattern.
parameter
vg_lite_buffer_t *target: 参照vg_lite_draw.
vg_lite_path_t *path: See vg_lite_draw.
vg_lite_fill_t fill_rule: See vg_lite_draw.
vg_lite_matrix_t *matrix0: A pointer to the vg_lite_matrix_t struct, which defines the 3x3 transformation matrix of the path. If the matrix is NULL, an identity matrix is assumed.
vg_lite_buffer_t *source: A pointer to the vg_lite_buffer_t structure that describes the source of the image pattern.
vg_lite_matrix_t *matrix1: A pointer to a vg_lite_matrix_t structure that defines a 3x3 transformation from source pixel to destination pixel. If the matrix is NULL, an identity matrix is assumed, which means that the source pixel will be copied directly to the 0,0 position of the destination pixel.
vg_lite_blend_t blend: See vg_lite_draw.
vg_lite_pattern_mode_t pattern mode: Specifies vg_lite_pattern_mode_t value that defines how to fill the area outside the image pattern.
vg_lite_color_t pattern_color: Specify a 32bpp ARGB color (vg_lite_color_t) that is applied to the fill outside the image pattern area when the pattern_mode value is VG_LITE_PATTERN_COLOR.
vg_lite_filter_t filter: Specifies the type of filter. All formats in the vg_lite_filter_t enumeration are valid formats for this function. A value of zero (0) indicates VG_LITE_FILTER_POINT.
3.6.4 Linear gradient initialization and control functions#
3.6.4.1 vg_lite_init_grad#
description
This function initializes the internal buffer of the linear gradient object with default settings for rendering.
parameter
vg_lite_linear_gradient_t *grad: A pointer to vg_lite_linear_gradient_t structure that defines the gradient to be initialized. Use the default values.
3.6.4.2 vg_lite_set_grad#
description
This function is used to set the value of a vg_lite_linear_gradient_t structure member.
Note: In cases where the input parameters are incomplete or invalid, the default gradient color is set using the following rules.
If no valid stop is specified (for example, due to an empty input array, out-of-range, or out-of-order check), a stop of 0 with (R, G, B, α) color (0.0, 0.0, 0.0, 1.0) (opaque black) and a stop 1 with color (1.0, 1.0, 1.0) (opaque white) are implicitly defined.
If at least one valid stop is specified, but no stop with offset 0 is defined, an implied stop is added with an offset of 0 and the same color as the first user-defined stop.
If at least one valid stop is specified, but no stop with an offset of 1 is defined, an implicit stop is added with an offset of 1 and the same color as the last user-defined stop.
parameter
vg_lite_linear_gradient_t *grad: A pointer to the vg_lite_linear_gradient_t struct to set.
uint32_t count: The count of colors in a linear gradient. The maximum patch count is defined by the VLC_MAX_GRAD, which is 16.
uint32_t *colors: Specifies the color array of the gradient, ARGB8888 format, with alpha in the highest position.
uint32_t *stops: A pointer to the offset at which the gradient stops.
3.6.4.3 vg_lite_update_grad#
description
This function is used to update or generate the value of an image object that will be rendered. vg_lite_linear_gradient_t object has an image buffer that renders the gradient pattern. The image buffer is created or updated with the appropriate gradient parameters.
parameter
vg_lite_linear_gradient_t *grad: A pointer to the vg_lite_linear_gradient_t struct that contains the updated values to use to render the object.
3.6.4.4 vg_lite_get_grad_matrix#
description
This function is used to get a pointer to the transformation matrix of the gradient object, which allows the application to manipulate the matrix to facilitate the correct rendering of the gradient path.
parameter
vg_lite_linear_gradient_t *grad: A pointer to vg_lite_linear_gradient_t struct that contains the matrix to retrieve.
3.6.4.5 vg_lite_clear_grad#
description
This function is used to clear the value of a linear gradient object, freeing memory in the image buffer.
parameter
vg_lite_linear_gradient_t *grad: A pointer to the vg_lite_linear_gradient_t structure to clean.
4. Constraints#
The GPU driver only supports one current context and one thread to issue commands to the GPU. GPU drivers do not support multiple concurrent contexts running simultaneously in multiple threads/processes because GPU kernel drivers do not support context switching. A GPU application can only use one context to issue commands to the GPU hardware at any one time. If a GPU application needs to switch contexts, it should callvg_lite_close to close the current context in the current thread, and then the vg_lite_init can be called to3.2.1.2 vg_lite_initinitialize a new context in the current thread or in another thread/process.
5. Performance recommendations and best practices#
5.1 Cache vs Non-cache#
When loading the vg_lite.ko module, you can configure whether to turn off cache (cache is enabled by default) through the cache parameter, the driver will refresh the CPU cache (D-cache and L2-cache) before submitting the command to the GPU hardware in cache mode, and set the command buffer and graphics memory to non-cache when mapped to the user-mode address space in non-cache mode. In general, cache mode has better performance, and the CPU is faster to write command buffer and read graphics memory.
At present, flushing the cache uses the dcache.cipa directive, which can only flash one cache line at a time, that is, 64Bytes, for larger buffers such as 1920x1080@ARGB32, it takes about 129600 loops, most of which are missed, which will bring some performance overhead, and if you use the dcache.ciall directive you can flush all caches, but may affect other processes so they are not used.
5.2 Memory Usage#
vg_lite.ko modules take up about 130KB of memory after loading, and hardly take up more space except command buffer and video memory, and each memory allocated by the vg_lite_hal_allocate_contiguous requires corresponding page table resources and a node size of 64B in addition to the memory itself.
5.3 Drawing Process#
Use the VGLite API to operate
Callvg_lite_initinitialize the GPU
If GPU memory is required, callvg_lite_allocateallocation and mapping
Draw using the API
Callvg_lite_finishorvg_lite_flushcommit command to render and wait for the drawing to finish
Before the process ends, the callingvg_lite_closeshuts down the GPU
6. Examples#
The K230 SDK includes several GPU examples, the source code is placed in src/little/buildroot-ext/package/vg_lite/test/samples
, open BR2_PACKAGE_VG_LITE_DEMOS in buildroot to build and add to the generated system image (open by default).
6.1 tiger#
This is an example of drawing a tiger image, which is generated in the current directory when run tiger.png
.
6.2 linear degrees#
This is an example of an image with a linear gradient that, when run, is generated in the current directory linearGrad.png
.
6.3 imgIndex#
This is an example that uses a color lookup table that, when run, generates , and in the current directory imgIndex1.png``imgIndex2.png
imgIndex4.png
, imgIndex8.png
each with a different number of color indexes.
6.4 vglite_drm#
Here is an example of GPU + DRM display linkage,
Note: Since the video output driver under Linux does notincludeinitialization, you need to make sure that the screen is already in use before running, for example, it can be run on a big coresample_vo.elf 3
.
6.5 vglite_cube#
This is an example of a GPU + DRM display linkage, drawing a constantly rotating cube on the screen.
Note: Since the video output driver under Linux does notincludeinitialization, you need to make sure that the screen is already in use before running, for example, it can be run on a big coresample_vo.elf 3
.