5.2 Ai2d Module API Manual
1. Overview
This manual guides developers who use MicroPython to build AI demos. It explains how to construct a preprocessing workflow and how to configure and run preprocessing on input images with nncase_runtime.ai2d. The Ai2d module wraps the preprocessing methods supported by ai2d and provides methods to build and run the preprocessing pipeline.
2. API Introduction
2.1 init
Description
Ai2d constructor.
Syntax
from libs.AI2D import Ai2d
my_ai2d = Ai2d(debug_mode=0)
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
debug_mode | Debug timing mode, 0 for timing, 1 for no timing, int type | Input | Default is 0 |
Return Value
Return Value | Description |
---|---|
Ai2d | Ai2d instance |
2.2 set_ai2d_dtype
Description
Set the data type and format for ai2d preprocessing input and output.
Syntax
import nncase_runtime as nn
import ulab.numpy as np
# Keyword form
my_ai2d.set_ai2d_dtype(input_format=nn.ai2d_format.NCHW_FMT, output_format=nn.ai2d_format.NCHW_FMT, input_type=np.uint8, output_type=np.uint8)
# Positional form, here with a packed RGB input
my_ai2d.set_ai2d_dtype(nn.ai2d_format.RGB_packed, nn.ai2d_format.NCHW_FMT, np.uint8, np.uint8)
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
input_format | Preprocessing input data format | Input | Required, determined by the formats in 3.1 type |
output_format | Preprocessing output data format | Input | Required, determined by the formats in 3.1 type |
input_type | Preprocessing input data type | Input | Required, choose a ulab.numpy data type such as np.uint8 |
output_type | Preprocessing output data type | Input | Required, choose a ulab.numpy data type such as np.uint8 |
Return Value
Return Value | Description |
---|---|
None | |
2.3 crop
Description
Crop preprocessing configuration method.
Syntax
my_ai2d.crop(0, 0, 200, 300)
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
start_x | Starting pixel in the width direction, int type | Input | Required |
start_y | Starting pixel in the height direction, int type | Input | Required |
width | Crop length in the width direction, int type | Input | Required |
height | Crop length in the height direction, int type | Input | Required |
Return Value
Return Value | Description |
---|---|
None | |
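The four crop arguments describe a rectangle inside the input image. As a rough illustration in plain Python (no ai2d involved), the helper below checks that the rectangle fits inside an NCHW input and returns the shape the crop would leave. The name `crop_output_shape` is hypothetical, and the bounds checks are an assumption: this manual does not say how ai2d reacts to an out-of-range rectangle.

```python
def crop_output_shape(input_shape, start_x, start_y, width, height):
    """Return the NCHW shape left after cropping (illustrative helper, not part of Ai2d)."""
    n, c, in_h, in_w = input_shape
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if start_x < 0 or start_y < 0:
        raise ValueError("start_x and start_y must be non-negative")
    # start_x/width run along the W axis, start_y/height along the H axis.
    if start_x + width > in_w or start_y + height > in_h:
        raise ValueError("crop rectangle exceeds the input image")
    return [n, c, height, width]


# Same arguments as the syntax line above, applied to a 512x512 input.
print(crop_output_shape([1, 3, 512, 512], 0, 0, 200, 300))
```

A shape computed this way is a candidate for the output shape passed to build when crop is the only configured method.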
2.4 shift
Description
Shift preprocessing configuration method.
Syntax
my_ai2d.shift(shift_val=2)
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
shift_val | Number of bits to shift right, int type | Input | Required |
Return Value
Return Value | Description |
---|---|
None | |
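The effect of shift_val can be pictured with ordinary integer arithmetic. The sketch below right-shifts a few unsigned 16-bit samples, which is how a RAW16 depth value can be reduced to fewer bits. The function `shift_right` is a hypothetical stand-in written for illustration; it assumes a plain logical right shift on unsigned values and does not call ai2d.

```python
def shift_right(samples, shift_val):
    """Right-shift unsigned 16-bit samples (illustrative stand-in for the shift step)."""
    if shift_val < 0:
        raise ValueError("shift_val must be non-negative")
    shifted = []
    for sample in samples:
        if not 0 <= sample <= 0xFFFF:
            raise ValueError("expected unsigned 16-bit samples")
        shifted.append(sample >> shift_val)
    return shifted


depth_row = [0, 255, 1024, 4095, 65535]
print(shift_right(depth_row, 2))  # shift_val=2, as in the syntax line above
print(shift_right(depth_row, 8))  # with 8 bits, every result fits in one byte
```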
2.5 pad
Description
Padding preprocessing configuration method.
Syntax
my_ai2d.pad(paddings=[0, 0, 0, 0, 5, 5, 15, 15], pad_mode=0, pad_val=[114, 114, 114])
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
paddings | List of 8 values giving the padding size on both sides of each dimension of a 4D (NCHW) image, two values each for N, C, H and W. Usually only the last two dimensions (H and W) are padded | Input | Required |
pad_mode | Padding mode. Only constant padding is supported, so set this to 0 | Input | Required |
pad_val | List of fill values, one per channel, written at each padded pixel position, e.g., [114, 114, 114] or [0, 0, 0] | Input | Required |
Return Value
Return Value | Description |
---|---|
None | |
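To make the 8-value paddings list concrete, the sketch below reads it as four (before, after) pairs for N, C, H and W and computes the padded shape. That pairing order is my reading of the description and of the example values, not something the manual states explicitly. `padded_shape` is a hypothetical helper in plain Python. It also checks that pad_val has one entry per channel.

```python
def padded_shape(input_shape, paddings, pad_val):
    """Return the NCHW shape after padding (illustrative helper, not part of Ai2d)."""
    if len(input_shape) != 4:
        raise ValueError("input_shape must have 4 entries (N, C, H, W)")
    if len(paddings) != 8:
        raise ValueError("paddings must have 8 entries, two per dimension")
    if len(pad_val) != input_shape[1]:
        raise ValueError("pad_val needs one value per channel")
    output_shape = []
    for dim, size in enumerate(input_shape):
        # Assumed layout: paddings[2*dim] is added before, paddings[2*dim + 1] after.
        before, after = paddings[2 * dim], paddings[2 * dim + 1]
        output_shape.append(size + before + after)
    return output_shape


# Same paddings and pad_val as the syntax line above, on a 630x610 (HxW) image.
print(padded_shape([1, 3, 630, 610], [0, 0, 0, 0, 5, 5, 15, 15], [114, 114, 114]))
```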
2.6 resize
Description
Resize preprocessing configuration method.
Syntax
import nncase_runtime as nn
my_ai2d.resize(interp_method=nn.interp_method.tf_bilinear, interp_mode=nn.interp_mode.half_pixel)
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
interp_method | Resize interpolation method | Input | Required, determined by the values in 3.2 interp_method |
interp_mode | Resize mode | Input | Required, determined by the values in 3.3 interp_mode |
Return Value
Return Value | Description |
---|---|
None | |
2.7 affine
Description
Affine preprocessing configuration method.
Syntax
import nncase_runtime as nn
# 2x3 affine matrix flattened row by row into a list of 6 values
affine_matrix = [0.2159457, -0.031286, -59.5312, 0.031286, 0.2159457, -35.30719]
my_ai2d.affine(interp_method=nn.interp_method.cv2_bilinear, cord_round=0, bound_ind=0, bound_val=127, bound_smooth=1, M=affine_matrix)
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
interp_method | Interpolation method used by the affine transformation | Input | Required |
cord_round | Coordinate rounding method, 0 or 1, uint32_t type | Input | Required, usually set to 0 |
bound_ind | Boundary pixel processing mode, 0 or 1, uint32_t type | Input | Required, usually set to 0 |
bound_val | Boundary fill value, uint32_t type | Input | Required, set to 127 |
bound_smooth | Boundary smoothing, 0 or 1, uint32_t type | Input | Required, set to 1 |
M | Affine transformation matrix as a flat list of 6 values, obtained by flattening the 2x3 matrix row by row; see the example above | Input | Required |
Return Value
Return Value | Description |
---|---|
None | |
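The M argument is easier to reason about once the flat list is tied back to a 2x3 matrix. The sketch below builds such a list from a scale, a rotation angle and a translation (a similarity transform), then recovers scale and angle from the example matrix above. The helper names are hypothetical, and the manual does not state whether ai2d applies M from source to destination or the other way round, so the code only shows the arithmetic and the row-by-row layout.

```python
import math


def similarity_matrix(scale, angle_rad, tx, ty):
    """Flatten [[a, -b, tx], [b, a, ty]] row by row into a 6-value list."""
    a = scale * math.cos(angle_rad)
    b = scale * math.sin(angle_rad)
    return [a, -b, tx,
            b, a, ty]


def apply_affine(m, x, y):
    """Apply a flattened 2x3 matrix to the point (x, y)."""
    return (m[0] * x + m[1] * y + m[2],
            m[3] * x + m[4] * y + m[5])


# Example matrix from the syntax section.
M = [0.2159457, -0.031286, -59.5312, 0.031286, 0.2159457, -35.30719]

# Recover the scale and rotation that this matrix encodes.
scale = math.hypot(M[0], M[3])
angle = math.atan2(M[3], M[0])
print("scale:", scale, "angle (degrees):", math.degrees(angle))

# Rebuilding the list from those values reproduces M up to rounding error.
rebuilt = similarity_matrix(scale, angle, M[2], M[5])
print(rebuilt)
print(apply_affine(M, 0, 0))  # the origin maps to the translation part
```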
2.8 build
Description
Construct the preprocessor according to the configured preprocessing methods.
Syntax
from libs.AI2D import Ai2d
import nncase_runtime as nn
my_ai2d = Ai2d(debug_mode=0)
my_ai2d.resize(nn.interp_method.tf_bilinear, nn.interp_mode.half_pixel)
# Input shape first, then output shape
my_ai2d.build([1, 3, 512, 512], [1, 3, 640, 640])
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
ai2d_input_shape | ai2d input data shape | Input | Required |
ai2d_output_shape | ai2d output data shape | Input | Required |
Return Value
Return Value | Description |
---|---|
None | |
2.9 run
Description
Execute preprocessing with the configured ai2d preprocessor and return the result as an nncase_runtime.tensor.
Syntax
ai2d_output_tensor = my_ai2d.run(img)
Parameters
Parameter Name | Description | Input / Output | Note |
---|---|---|---|
input_np | Preprocessing input data | Input | Required |
Return Value
Return Value | Description |
---|---|
ai2d_output_tensor | Data after ai2d preprocessing |
3. Data Structure Description
3.1 type
Input Format | Output Format | Note |
---|---|---|
YUV420_NV12 | RGB_planar/YUV420_NV12 | |
YUV420_NV21 | RGB_planar/YUV420_NV21 | |
YUV420_I420 | RGB_planar/YUV420_I420 | |
YUV400 | YUV400 | |
NCHW(RGB_planar) | NCHW(RGB_planar) | |
RGB_packed | RGB_planar/RGB_packed | |
RAW16 | RAW16/8 | Depth map, perform shift operation |
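The table above can be turned into a small lookup for checking a format pair before configuring set_ai2d_dtype. The sketch below is plain Python. The keys are the table's labels, not the attribute names of nn.ai2d_format, and reading "RAW16/8" as "RAW16 or RAW8" is an assumption on my part. `SUPPORTED_CONVERSIONS` and `is_supported` are hypothetical names.

```python
# Input format label -> set of output format labels, copied from the table.
SUPPORTED_CONVERSIONS = {
    "YUV420_NV12": {"RGB_planar", "YUV420_NV12"},
    "YUV420_NV21": {"RGB_planar", "YUV420_NV21"},
    "YUV420_I420": {"RGB_planar", "YUV420_I420"},
    "YUV400": {"YUV400"},
    "NCHW(RGB_planar)": {"NCHW(RGB_planar)"},
    "RGB_packed": {"RGB_planar", "RGB_packed"},
    "RAW16": {"RAW16", "RAW8"},  # assumed meaning of "RAW16/8"
}


def is_supported(input_format, output_format):
    """Return True if the table lists this input/output format pair."""
    return output_format in SUPPORTED_CONVERSIONS.get(input_format, set())


print(is_supported("RGB_packed", "RGB_planar"))
print(is_supported("YUV400", "RGB_planar"))
```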
3.2 interp_method
The interpolation methods available for resize preprocessing are listed below.
Method | Description | Note |
---|---|---|
nn.interp_method.tf_nearest | tf’s nearest neighbor interpolation | |
nn.interp_method.tf_bilinear | tf’s bilinear interpolation | |
nn.interp_method.cv2_nearest | cv2’s nearest neighbor interpolation | |
nn.interp_method.cv2_bilinear | cv2’s bilinear interpolation | |
3.3 interp_mode
Mode | Description | Note |
---|---|---|
nn.interp_mode.none | No special alignment strategy | |
nn.interp_mode.align_corner | Corner forced alignment | |
nn.interp_mode.half_pixel | Center alignment | |
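The three alignment strategies differ in how an output pixel index is mapped back to a source coordinate. The sketch below uses the conventional formulas for these mode names as found in common resize implementations. Whether ai2d follows exactly these formulas is an assumption, since the manual gives only the one-line descriptions above. `source_coord` is a hypothetical helper, and the string keys stand in for the nn.interp_mode values.

```python
def source_coord(dst_index, in_size, out_size, mode):
    """Map an output index to a (fractional) source coordinate along one axis."""
    if mode == "none":
        # No alignment: scale the index directly.
        return dst_index * in_size / out_size
    if mode == "align_corner":
        # First and last samples of input and output coincide.
        if out_size == 1:
            return 0.0
        return dst_index * (in_size - 1) / (out_size - 1)
    if mode == "half_pixel":
        # Pixel centers are aligned instead of pixel corners.
        return (dst_index + 0.5) * in_size / out_size - 0.5
    raise ValueError("unknown mode: %s" % mode)


# Resizing 512 -> 640 along one axis, as in the example program.
for mode in ("none", "align_corner", "half_pixel"):
    first = source_coord(0, 512, 640, mode)
    last = source_coord(639, 512, 640, mode)
    print(mode, first, last)
```

With align_corner the last output pixel lands exactly on the last input pixel, while half_pixel places the first output pixel slightly outside the input grid, so an implementation has to clamp or extrapolate at the border.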
4. Example Program
Attention
(1) The Affine and Resize functions are mutually exclusive and cannot be enabled at the same time.
(2) The Shift function only accepts RAW16 as its input format.
(3) The pad value is configured per channel, so the number of elements in the list must equal the number of channels.
(4) When multiple functions are configured, they always execute in the order Crop->Shift->Resize/Affine->Pad, so make sure the configured parameters match this order. If the processing you need follows a different order, initialize multiple Ai2d instances to implement the preprocessing process.
Below is an example program:
from libs.PipeLine import PipeLine, ScopedTiming
from libs.AI2D import Ai2d
from media.media import *
import nncase_runtime as nn
import gc
import sys, os

if __name__ == "__main__":
    # Display mode, default is "hdmi", can choose "hdmi" or "lcd"
    display_mode = "hdmi"
    if display_mode == "hdmi":
        display_size = [1920, 1080]
    else:
        display_size = [800, 480]

    # Initialize PipeLine for image processing workflow
    pl = PipeLine(rgb888p_size=[512, 512], display_size=display_size, display_mode=display_mode)
    pl.create()  # Create PipeLine instance
    my_ai2d = Ai2d(debug_mode=0)  # Initialize Ai2d instance
    # Configure resize preprocessing method
    my_ai2d.resize(nn.interp_method.tf_bilinear, nn.interp_mode.half_pixel)
    # Construct preprocessing process
    my_ai2d.build([1, 3, 512, 512], [1, 3, 640, 640])
    try:
        while True:
            os.exitpoint()  # Check for exit signal
            with ScopedTiming("total", 1):
                img = pl.get_frame()  # Get current frame data
                print(img.shape)  # Original image shape is [1, 3, 512, 512]
                ai2d_output_tensor = my_ai2d.run(img)  # Execute resize preprocessing
                ai2d_output_np = ai2d_output_tensor.to_numpy()  # Type conversion
                print(ai2d_output_np.shape)  # Preprocessed shape is [1, 3, 640, 640]
                gc.collect()  # Garbage collection
    except Exception as e:
        sys.print_exception(e)  # Print exception information
    finally:
        pl.destroy()  # Destroy PipeLine instance
In the above code, the resize preprocessing method is defined, with a preprocessing input resolution of (512, 512) and a preprocessing output resolution of (640, 640).
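The constraints listed under Attention can be checked in plain Python before any Ai2d instance is configured. The sketch below does two things: `check_config` reports violations of notes (1) to (3), and `split_into_instances` groups a desired sequence of operations into runs that each respect the fixed order Crop->Shift->Resize/Affine->Pad, one run per Ai2d instance. Both functions and the string operation names are hypothetical helpers written for this illustration. They do not call ai2d, and the greedy split is just one simple way to satisfy note (4).

```python
# Stage ranks follow the documented order Crop->Shift->Resize/Affine->Pad.
STAGE_RANK = {"crop": 0, "shift": 1, "resize": 2, "affine": 2, "pad": 3}


def check_config(ops, input_format, channels=3, pad_val=None):
    """Return a list of problems found in the configuration of one Ai2d instance."""
    problems = []
    unknown = [op for op in ops if op not in STAGE_RANK]
    if unknown:
        problems.append("unknown operations: %s" % unknown)
    if "resize" in ops and "affine" in ops:
        problems.append("resize and affine cannot be enabled together")
    if "shift" in ops and input_format != "RAW16":
        problems.append("shift requires RAW16 input")
    if "pad" in ops and pad_val is not None and len(pad_val) != channels:
        problems.append("pad_val needs one value per channel")
    return problems


def split_into_instances(ops):
    """Greedily split the desired op sequence into groups that follow the fixed order."""
    groups, current, last_rank = [], [], -1
    for op in ops:
        rank = STAGE_RANK[op]
        # A stage at or before the previous one cannot run in the same instance.
        if rank <= last_rank:
            groups.append(current)
            current = []
        current.append(op)
        last_rank = rank
    if current:
        groups.append(current)
    return groups


print(check_config(["crop", "resize", "pad"], "RGB_packed", pad_val=[114, 114, 114]))
print(check_config(["shift", "resize", "affine"], "RGB_packed"))
print(split_into_instances(["pad", "resize"]))
```

In the last call, padding before resizing does not match the fixed order, so the sketch assigns the two steps to separate instances.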