2022-08-19

iOS Vision: Drawing Detected Rectangles on Live Camera Preview Works on iPhone But Not on iPad

I'm using the iOS Vision framework to detect rectangles in real time with the camera on an iPhone, and it works well. The live preview displays a moving yellow rectangle around the detected shape.

However, when the same code runs on an iPad, the yellow rectangle tracks accurately along the X axis, but along the Y axis it is always slightly offset from centre and is not correctly scaled. To illustrate, the included image shows both devices tracking the same test square. In both cases, after I capture the image and plot the rectangle on the full camera frame (1920 x 1080), everything looks fine. It's only the live preview on the iPad that does not track properly.

I believe the issue is caused by the iPad screen's 4:3 aspect ratio. The iPhone's full-screen preview scales its 1920 x 1080 raw frame down to 414 x 718, so both the X and Y dimensions are scaled down by roughly the same factor (about 2.6). The iPad, however, scales the 1920 x 1080 frame down to 810 x 964, which warps the image and causes the error along the Y axis.
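To make the mismatch concrete, here is a quick sketch of the per-axis scale factors, using the approximate portrait dimensions quoted above (these sizes are specific to my devices, not anything canonical):

import CoreGraphics

// Per-axis scale factors from the portrait 1080 x 1920 frame
// down to each device's preview size.
let frameSize = CGSize(width: 1080, height: 1920)
let iPhonePreview = CGSize(width: 414, height: 718)
let iPadPreview = CGSize(width: 810, height: 964)

func scaleFactors(to preview: CGSize) -> (x: CGFloat, y: CGFloat) {
    (frameSize.width / preview.width, frameSize.height / preview.height)
}

let iPhoneFactors = scaleFactors(to: iPhonePreview) // ≈ (2.61, 2.67): nearly uniform
let iPadFactors = scaleFactors(to: iPadPreview)     // ≈ (1.33, 1.99): Y is compressed far more than X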

A rough workaround would be to make the preview layer smaller than the full screen and scale it down uniformly at the 16:9 ratio of the 1920 x 1080 frame, but I would prefer to use the full screen. Has anyone come across this issue and found a transform that correctly translates and scales the rect observation onto the iPad screen?

Example test images and a code snippet are below.

[Image: comparison of the detected rectangle on iPhone vs. iPad]

let rect: VNRectangleObservation

// Camera preview (live) image dimensions
let previewWidth = self.previewLayer!.bounds.width
let previewHeight = self.previewLayer!.bounds.height

// Dimensions of the raw captured frames from the camera (1920 x 1080)
let frameWidth = self.frame!.width
let frameHeight = self.frame!.height

// Transform that flips the detected rectangle from the Vision framework's
// bottom-left-origin coordinate system to SwiftUI's top-left origin
let transform = CGAffineTransform(scaleX: 1, y: -1).translatedBy(x: 0, y: -previewHeight)
let scale = CGAffineTransform.identity.scaledBy(x: previewWidth, y: previewHeight)

// Scale the normalized [0, 1] rect up to the preview window's dimensions,
// then flip it from a bottom-left to a top-left origin
var bounds: CGRect = rect.boundingBox.applying(scale).applying(transform)

// The rest of the code draws the bounds CGRect in yellow onto the preview
// window, as shown in the image.
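
One transform that may handle this without any manual scaling, assuming self.previewLayer is an AVCaptureVideoPreviewLayer, is to let the layer itself do the conversion via layerRectConverted(fromMetadataOutputRect:), which accounts for the layer's videoGravity (and therefore for a 4:3 screen displaying a 16:9 frame). A minimal sketch follows; the function name previewRect(for:in:) is just for illustration:

import AVFoundation
import Vision

// Convert a Vision bounding box into preview-layer coordinates, letting
// AVCaptureVideoPreviewLayer compensate for its own videoGravity.
func previewRect(for observation: VNRectangleObservation,
                 in previewLayer: AVCaptureVideoPreviewLayer) -> CGRect {
    let box = observation.boundingBox // normalized [0, 1], bottom-left origin

    // Metadata output rects are normalized with a top-left origin,
    // so flip the Y axis first.
    let metadataRect = CGRect(x: box.origin.x,
                              y: 1 - box.origin.y - box.height,
                              width: box.width,
                              height: box.height)

    // This conversion applies the layer's videoGravity, so the result should
    // stay aligned even when the preview's aspect ratio differs from 16:9.
    return previewLayer.layerRectConverted(fromMetadataOutputRect: metadataRect)
}

With that, bounds would become previewRect(for: rect, in: self.previewLayer!), and the manual scale and flip transforms above would no longer be needed.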

