kaolin.render.camera

Kaolin provides an extensive camera API. For an overview, see the Camera class docs.

API

Classes

Camera Conversions

Aligning camera conventions across different codebases can take time and care. Kaolin ships with converters between kaolin.render.camera.Camera and the camera conventions of several popular codebases, including nerfstudio gsplat, INRIA gaussian splats, and polyscope.

Community contributions are welcome to expand this set.

kaolin.render.camera.kaolin_camera_to_gsplat_nerfstudio(kal_camera)

Convert Kaolin Camera to nerfstudio gsplat library camera parameters, as expected by gsplat.rendering.rasterization. Batched conversion is supported. Only the pinhole camera model is covered.

Note

This has been tested with gsplat==1.4.0.

Parameters

kal_camera (Camera) – camera to convert.

Returns

A dict with the following keys:

  • "Ks" (torch.Tensor): intrinsics matrix of shape \((C, 3, 3)\).

  • "viewmats" (torch.Tensor): view matrix of shape \((C, 4, 4)\).

  • "width" (int): image width from the source camera.

  • "height" (int): image height from the source camera.

  • "camera_model" (str): always "pinhole".

C is the number of cameras.

Return type

(dict)

Raises

RuntimeError – if kal_camera does not use a pinhole intrinsics model.

kaolin.render.camera.gsplat_nerfstudio_camera_to_kaolin(Ks, viewmats, width=None, height=None, camera_model='pinhole', near_plane=0.01, far_plane=100.0)

Convert nerfstudio gsplat library camera parameters, as expected by gsplat.rendering.rasterization, to Kaolin Camera. Batched conversion is supported.

Parameters
  • Ks (torch.Tensor) – intrinsics matrices, of shape (C, 3, 3)

  • viewmats (torch.Tensor) – view matrices, of shape (C, 4, 4)

  • width (optional, int) – image width; if not set, the value is guessed from Ks

  • height (optional, int) – image height; if not set, the value is guessed from Ks

  • camera_model (optional, str) – currently only pinhole is supported

  • near_plane (optional, float) – near clipping plane, defines the min depth of the view frustum.

  • far_plane (optional, float) – far clipping plane, defines the max depth of the view frustum.

Returns

converted Kaolin camera.

Return type

(Camera)
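For orientation, the kind of pinhole intrinsics matrix passed in Ks can be sketched in plain Python as below. The helper names are hypothetical, and the resolution guess assumes the principal point sits at the image center, which may not match how Kaolin actually guesses width and height.

```python
def intrinsics_matrix(fx, fy, cx, cy):
    """Standard 3x3 pinhole intrinsics: focal lengths and principal point in pixels."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def guess_resolution(K):
    """Assumption: the principal point is the image center, so size ~ 2 * (cx, cy)."""
    width = int(round(2.0 * K[0][2]))
    height = int(round(2.0 * K[1][2]))
    return width, height


K = intrinsics_matrix(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
width, height = guess_resolution(K)
```

When the principal point is off-center, a guess of this kind is wrong, which is one reason to pass width and height explicitly.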

kaolin.render.camera.kaolin_camera_to_gsplat_inria(kal_camera, gs_cam_cls)

Converts Kaolin Camera to INRIA gaussian splats camera (gsplats.scene.cameras.Camera).

Note

This has been tested with commit 472689c.

Parameters
  • kal_camera (Camera) – camera to convert.

  • gs_cam_cls (class) – This is the gsplats Camera class, usually located in gsplats/scene/cameras.py.

Returns

converted INRIA gaussian splats camera.

Return type

(gsplats.scene.cameras.Camera)

kaolin.render.camera.gsplat_inria_camera_to_kaolin(gs_camera)

Convert INRIA gaussian splats camera (gsplats.scene.cameras.Camera) to Kaolin Camera.

Note

This has been tested with commit 472689c.

Parameters

gs_camera (gsplats.scene.cameras.Camera) – camera to convert.

Returns

converted Kaolin camera.

Return type

(Camera)

kaolin.render.camera.kaolin_camera_to_polyscope(camera)

Converts Kaolin Camera to a polyscope camera (polyscope.core.CameraParameters). Polyscope cameras are always assumed to exist on a cpu device. The converted information includes the camera extrinsics, and intrinsics for the field of view.

Parameters

camera (Camera) – camera to convert.

Returns

A polyscope camera object.

Return type

(ps.core.CameraParameters)

kaolin.render.camera.polyscope_camera_to_kaolin(ps_camera, width, height, near=0.01, far=100.0, dtype=torch.float32, device='cpu')

Converts a polyscope camera (polyscope.core.CameraParameters) to Kaolin Camera. The converted information includes the camera extrinsics, the image plane dimensions and the field of view. Additional parameters that Kaolin cameras require and polyscope cameras lack, such as the near and far planes and the device, can be passed explicitly if needed.

Parameters
  • ps_camera (ps.core.CameraParameters) – A polyscope camera object.

  • width (int) – Image plane width in pixels.

  • height (int) – Image plane height in pixels.

  • near (optional, float) – near clipping plane, defines the min depth of the view frustum.

  • far (optional, float) – far clipping plane, defines the max depth of the view frustum.

  • dtype (optional, torch.dtype) – Datatype of the kaolin camera, converted from polyscope float32 precision.

  • device (optional, torch.device or str) – the device on which camera parameters will be allocated. Default: cpu

Returns

A kaolin camera object.

Return type

(Camera)

Functions

class kaolin.render.camera.CameraFOV(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntEnum

A camera’s field of view can be defined along any one of the following directions:

DIAGONAL = 2
HORIZONTAL = 0
VERTICAL = 1
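
The three directions are tied together by the image aspect ratio. The following is a minimal sketch of that relation, assuming a pinhole camera with square pixels and a centered principal point. `FOVDirection`, `extent` and `convert_fov` are hypothetical local helpers that mirror the values above; they are not part of Kaolin.

```python
import math
from enum import IntEnum


class FOVDirection(IntEnum):
    """Hypothetical local mirror of the CameraFOV values listed above."""
    HORIZONTAL = 0
    VERTICAL = 1
    DIAGONAL = 2


def extent(direction, width, height):
    """Image extent in pixels measured along the given FOV direction."""
    if direction == FOVDirection.HORIZONTAL:
        return float(width)
    if direction == FOVDirection.VERTICAL:
        return float(height)
    return math.hypot(width, height)  # diagonal of the image plane


def convert_fov(fov, src, dst, width, height):
    """Convert a field of view (radians) between directions.

    Uses tan(fov / 2) = (extent / 2) / focal_length with a shared focal length.
    """
    focal = (extent(src, width, height) / 2.0) / math.tan(fov / 2.0)
    return 2.0 * math.atan((extent(dst, width, height) / 2.0) / focal)


fov_h = math.radians(90.0)
fov_v = convert_fov(fov_h, FOVDirection.HORIZONTAL, FOVDirection.VERTICAL, 800, 600)
fov_d = convert_fov(fov_h, FOVDirection.HORIZONTAL, FOVDirection.DIAGONAL, 800, 600)
```
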
kaolin.render.camera.allclose(input, other, rtol=1e-05, atol=1e-08, equal_nan=False)

Checks whether the extrinsics and intrinsics of two cameras are close, using torch.allclose().

Parameters
  • input (Camera) – first camera to compare

  • other (Camera) – second camera to compare

  • atol (float, optional) – absolute tolerance. Default: 1e-08

  • rtol (float, optional) – relative tolerance. Default: 1e-05

  • equal_nan (bool, optional) – if True, then two NaNs will be considered equal. Default: False

Returns

Result of the comparison

Return type

(bool)
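The tolerance rule documented for torch.allclose is |input - other| <= atol + rtol * |other|, applied elementwise. The sketch below applies that rule to flat lists of floats to show how rtol, atol and equal_nan interact. `values_allclose` is a hypothetical stand-in, not the Kaolin function, and it simply returns False on a length mismatch.

```python
import math


def values_allclose(input_vals, other_vals, rtol=1e-05, atol=1e-08, equal_nan=False):
    """Elementwise closeness test following the torch.allclose tolerance rule."""
    if len(input_vals) != len(other_vals):
        return False  # simplification: no broadcasting in this sketch
    for a, b in zip(input_vals, other_vals):
        if math.isnan(a) or math.isnan(b):
            if equal_nan and math.isnan(a) and math.isnan(b):
                continue
            return False
        if a == b:
            continue  # also covers equal infinities
        if math.isinf(a) or math.isinf(b):
            return False
        if abs(a - b) > atol + rtol * abs(b):
            return False
    return True


close = values_allclose([1.0, 2.0], [1.0, 2.0 + 1e-9])
far = values_allclose([1.0], [1.001])
```

Note that the rule is not symmetric: the relative tolerance scales with other, not input.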

kaolin.render.camera.blender_coords()

Blender world coordinates are right-handed, with the z axis pointing upwards:

Z      Y
^    /
|  /
|---------> X
kaolin.render.camera.camera_path_generator(trajectory, frames_between_cameras=60, interpolation='catmull_rom')

A finite generator function that returns continuous camera objects on a path interpolated from a trajectory of cameras.

This generator is exhausted after it returns the last point on the path. If interpolation is ‘polynomial’, the trajectory is assumed to contain at least 2 cameras. If interpolation is ‘catmull_rom’, the trajectory is assumed to contain at least 4 cameras.

Parameters
  • trajectory (List[kaolin.render.camera.Camera]) – A trajectory of camera nodes, used to form a continuous path.

  • frames_between_cameras (int) – Number of interpolated points generated between each pair of cameras on the trajectory. In essence, this value controls how detailed, or smooth, the path is.

  • interpolation (str) – Type of interpolation function used: ‘polynomial’ uses a smoothstep polynomial function which tends to overshoot around the keyframes. This interpolator is fitting for paths orbiting an object of interest. ‘catmull_rom’ uses a spline defined by 4 control points, guaranteed to pass precisely through the keyframes.

Returns

An interpolated camera object formed by the cameras trajectory.

Return type

(Iterator[kaolin.render.camera.Camera])
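The reason ‘catmull_rom’ needs at least 4 cameras is that each spline segment uses the two keyframes it connects plus one neighbor on each side. The sketch below shows this for camera positions only, using the uniform Catmull-Rom formula. Kaolin interpolates full camera objects and may handle the path endpoints differently; `catmull_rom` and `position_path` are hypothetical helpers.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom spline between p1 and p2 for t in [0, 1]."""
    return tuple(
        0.5 * ((2.0 * b)
               + (-a + c) * t
               + (2.0 * a - 5.0 * b + 4.0 * c - d) * t ** 2
               + (-a + 3.0 * b - 3.0 * c + d) * t ** 3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def position_path(keyframes, frames_between=4):
    """Finite generator over spline points for the interior segments."""
    for i in range(1, len(keyframes) - 2):
        window = keyframes[i - 1:i + 3]  # 4 control points per segment
        for f in range(frames_between):
            yield catmull_rom(*window, f / frames_between)
    yield tuple(float(v) for v in keyframes[-2])  # end of the last segment


keyframes = [(0, 0, 0), (1, 0, 0), (2, 1, 0), (3, 1, 0), (4, 0, 0)]
path = list(position_path(keyframes, frames_between=4))
```

At t = 0 each segment evaluates exactly to its starting keyframe, which is the pass-through property described for ‘catmull_rom’.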

kaolin.render.camera.down_from_homogeneous(homogeneous_vectors)
  1. Performs perspective division by dividing each vector by its w coordinate.

  2. Down-projects vectors from 4D homogeneous space to 3D space.

Parameters
  • homogeneous_vectors (torch.Tensor) – the input vectors, of shape \((..., 4)\)

Returns

The 3D vectors, of the same shape as the inputs except that the last dimension is 3.

Return type

(torch.Tensor)
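The two steps above can be sketched in plain Python on a list of 4-tuples. `down_from_homogeneous_sketch` is a hypothetical illustration of the arithmetic, not the tensor-based Kaolin function.

```python
def down_from_homogeneous_sketch(vectors):
    """Divide each (x, y, z, w) by w, then drop the w coordinate."""
    return [(x / w, y / w, z / w) for x, y, z, w in vectors]


points_3d = down_from_homogeneous_sketch([
    (2.0, 4.0, 6.0, 2.0),   # w = 2 halves every coordinate
    (1.0, 0.0, -1.0, 1.0),  # w = 1 leaves the point unchanged
])
```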

kaolin.render.camera.generate_centered_custom_resolution_pixel_coords(img_width, img_height, res_x=None, res_y=None, device=None)

Creates a pixel grid with a custom resolution, with the rays spaced out according to the scale. The scale is determined by the ratio of \(\text{img_width / res_x, img_height / res_y}\). The ray grid is of resolution \(\text{res_x} \times \text{res_y}\).

Parameters
  • img_width (int) – width of camera image plane.

  • img_height (int) – height of camera image plane.

  • res_x (int) – x resolution of pixel grid to be created

  • res_y (int) – y resolution of pixel grid to be created

  • device (torch.device, optional) – Device on which the grid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{s, s, ..., s})\) up to \((\text{height-s, height-s... height-s})\).

Tensor 1 contains repeated rows of indices: \((\text{s, s+1, ..., width-s})\).

\(\text{s}\) is \(\text{scale/2}\) where \(\text{scale}\) is \((\text{img_width / res_x, img_height / res_y})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_centered_pixel_coords(img_width, img_height, device=None)

Creates a pixel grid with rays intersecting the center of each pixel. The ray grid is of resolution img_width x img_height.

Parameters
  • img_width (int) – width of image.

  • img_height (int) – height of image.

  • device (torch.device, optional) – Device on which the grid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{0.5, 0.5, ..., 0.5})\) up to \((\text{height-0.5, height-0.5... height-0.5})\).

Tensor 1 contains repeated rows of indices: \((\text{0.5, 1.5, ..., width-0.5})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)
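The layout of the two returned tensors can be sketched with nested lists. `centered_pixel_coords` is a hypothetical plain-Python stand-in that reproduces the values described above; dropping the 0.5 offset gives the integer grid described for generate_default_grid.

```python
def centered_pixel_coords(img_width, img_height):
    """Return (rows, cols), each a height x width nested list of pixel centers."""
    # Tensor 0: every entry of row y holds the same value y + 0.5.
    rows = [[y + 0.5] * img_width for y in range(img_height)]
    # Tensor 1: every row repeats 0.5, 1.5, ..., width - 0.5.
    cols = [[x + 0.5 for x in range(img_width)] for _ in range(img_height)]
    return rows, cols


rows, cols = centered_pixel_coords(img_width=4, img_height=3)
```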

kaolin.render.camera.generate_default_grid(width, height, device=None)

Creates a pixel grid of integer coordinates with resolution width x height.

Parameters
  • width (int) – width of image.

  • height (int) – height of image.

  • device (torch.device, optional) – Device on which the meshgrid tensors will be created.

Returns

A tuple of two tensors of shapes \((\text{height, width})\).

Tensor 0 contains rows of running indices: \((\text{0, 0, ..., 0})\) up to \((\text{height-1, height-1... height-1})\).

Tensor 1 contains repeated rows of indices: \((\text{0, 1, ..., width-1})\).

Return type

meshgrid (torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_ortho_rays(camera, coords_grid=None)

Default ray generation function for ortho cameras.

Parameters
  • camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).

  • coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate identifies the pixel \((\text{i, j})\), and the fractional part in \([\text{0,1}]\) gives the location within that pixel. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated ortho rays for the camera, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_perspective_projection(fovyangle, ratio=1.0, dtype=torch.float32)

Generate perspective projection matrix for a given camera fovy angle.

Parameters
  • fovyangle (float) – field of view angle of y axis, \(\tan(\frac{fovy}{2}) = \frac{y}{f}\).

  • ratio (float) – aspect ratio \((\frac{width}{height})\). Default: 1.0.

Returns

camera projection matrix, of shape \((3, 1)\).

Return type

(torch.FloatTensor)

kaolin.render.camera.generate_pinhole_rays(camera, coords_grid=None)

Default ray generation function for pinhole cameras.

This function assumes that the principal point (the pinhole location) is specified by a displacement (camera.x0, camera.y0) in pixel coordinates from the center of the image.

The Kaolin camera class does not enforce a coordinate space for how the principal point is specified, so users will need to make sure that the correct principal point conventions are followed for the cameras passed into this function.

Parameters
  • camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).

  • coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate identifies the pixel \((\text{i, j})\), and the fractional part in \([\text{0,1}]\) gives the location within that pixel. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated pinhole rays for the camera, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)

kaolin.render.camera.generate_rays(camera, coords_grid=None)

Default ray generation function for unbatched kaolin cameras. The camera lens type determines the exact ray generation logic that runs (e.g. pinhole, ortho).

Parameters
  • camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).

  • coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate identifies the pixel \((\text{i, j})\), and the fractional part in \([\text{0,1}]\) gives the location within that pixel. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.

Returns

The generated camera rays according to the camera lens type, as ray origins and ray direction tensors of \((\text{HxW, 3})\).

Return type

(torch.FloatTensor, torch.FloatTensor)
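The difference between the two lens types can be sketched for a single pixel in camera space: pinhole rays share one origin and fan out, while ortho rays are parallel and start at different origins. The sketch below is an illustration under stated assumptions, not Kaolin's implementation: it assumes the camera looks down the negative z axis, pixel y grows downward, and the ortho view volume spans [-1, 1] in x and y. All function names are hypothetical.

```python
import math


def pinhole_ray(px, py, width, height, focal_x, focal_y, x0=0.0, y0=0.0):
    """Camera-space ray through pixel coordinate (px, py) for a pinhole lens."""
    # Offset from the principal point, itself displaced (x0, y0) from the center.
    dx = (px - width / 2.0 - x0) / focal_x
    dy = -(py - height / 2.0 - y0) / focal_y  # flip: pixel y down, camera y up
    norm = math.sqrt(dx * dx + dy * dy + 1.0)
    origin = (0.0, 0.0, 0.0)                     # all rays start at the pinhole
    direction = (dx / norm, dy / norm, -1.0 / norm)
    return origin, direction


def ortho_ray(px, py, width, height):
    """Camera-space ray for an ortho lens: per-pixel origin, shared direction."""
    origin = (2.0 * px / width - 1.0, 1.0 - 2.0 * py / height, 0.0)
    direction = (0.0, 0.0, -1.0)
    return origin, direction


center_origin, center_dir = pinhole_ray(2.0, 2.0, 4, 4, focal_x=2.0, focal_y=2.0)
corner_origin, corner_dir = ortho_ray(0.0, 0.0, 4, 4)
```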

kaolin.render.camera.generate_rotate_translate_matrices(camera_position, look_at, camera_up_direction)

Generate rotation and translation matrix for given camera parameters.

Formula is \(\text{P_cam} = \text{rot_mtx} * (\text{P_world} - \text{trans_mtx})\)

Parameters
  • camera_position (torch.FloatTensor) – positions of the cameras, of shape \((\text{batch_size}, 3)\).

  • look_at (torch.FloatTensor) – points the cameras are looking at, of shape \((\text{batch_size}, 3)\).

  • camera_up_direction (torch.FloatTensor) – up directions of the cameras, of shape \((\text{batch_size}, 3)\), generally [0, 1, 0].

Returns

the camera rotation matrix of shape \((\text{batch_size}, 3, 3)\) and the camera transformation matrix of shape \((\text{batch_size}, 3)\)

Return type

(torch.FloatTensor, torch.FloatTensor)
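A look-at construction consistent with the formula above can be sketched for a single camera in plain Python. The axis convention is an assumption (right-handed, with the camera looking down its negative z axis), and `look_at_matrices` and `world_to_camera` are hypothetical helpers rather than Kaolin functions.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def look_at_matrices(camera_position, look_at, camera_up_direction):
    """Return (rot_mtx, trans_mtx); rows of rot_mtx are the camera axes."""
    cam_z = normalize(tuple(p - t for p, t in zip(camera_position, look_at)))
    cam_x = normalize(cross(camera_up_direction, cam_z))
    cam_y = cross(cam_z, cam_x)
    return (cam_x, cam_y, cam_z), tuple(camera_position)


def world_to_camera(point, rot_mtx, trans_mtx):
    """P_cam = rot_mtx * (P_world - trans_mtx)."""
    shifted = tuple(p - t for p, t in zip(point, trans_mtx))
    return tuple(sum(r * s for r, s in zip(row, shifted)) for row in rot_mtx)


rot, trans = look_at_matrices((0.0, 0.0, 5.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
target_in_camera = world_to_camera((0.0, 0.0, 0.0), rot, trans)
```

With the camera 5 units up the z axis and looking at the origin, the origin lands 5 units in front of the camera, at z = -5 in camera space under the assumed convention.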

kaolin.render.camera.generate_transformation_matrix(camera_position, look_at, camera_up_direction)

Generate transformation matrix for given camera parameters.

Formula is \(\text{P_cam} = \text{P_world} * \text{transformation_mtx}\), with \(\text{P_world}\) being the points coordinates padded with 1.

Parameters
  • camera_position (torch.FloatTensor) – positions of the cameras, of shape \((\text{batch_size}, 3)\).

  • look_at (torch.FloatTensor) – points the cameras are looking at, of shape \((\text{batch_size}, 3)\).

  • camera_up_direction (torch.FloatTensor) – up directions of the cameras, of shape \((\text{batch_size}, 3)\), generally [0, 1, 0].

Returns

The camera transformation matrix of shape \((\text{batch_size}, 4, 3)\).

Return type

(torch.FloatTensor)

kaolin.render.camera.gsplats_camera_to_kaolin(gs_camera)

Deprecated function name for INRIA camera conversion.

Use gsplat_inria_camera_to_kaolin() instead.

kaolin.render.camera.kaolin_camera_to_gsplats(kal_camera, gs_cam_cls)

Deprecated function name for INRIA camera conversion.

Use kaolin_camera_to_gsplat_inria() instead.

kaolin.render.camera.loop_camera_path_generator(trajectory, frames_between_cameras=60, interpolation='polynomial', repeat=None)

A generator function that returns continuous camera objects on a smoothed path interpolated from a trajectory of cameras.

The trajectory is assumed to contain at least 2 cameras, where the first and last cameras form a looped path. Certain interpolation modes (e.g. catmull_rom) may require additional cameras. If repeat is None, this generator is never exhausted and can be iterated indefinitely to generate continuous camera motion. Otherwise, the loop repeats a finite number of times.

Parameters
  • trajectory (List[kaolin.render.camera.Camera]) – A trajectory of camera nodes, used to form a continuous path.

  • frames_between_cameras (int) – Number of interpolated points generated between each pair of cameras on the trajectory. In essence, this value controls how detailed, or smooth the path is.

  • interpolation (str) – Type of interpolation function used: ‘polynomial’ uses a smoothstep polynomial function which tends to overshoot around the keyframes. This interpolator is fitting for paths orbiting an object of interest. ‘catmull_rom’ uses a spline defined by 4 control points, guaranteed to pass precisely through the keyframes.

  • repeat (int, Optional) – If specified, will limit the number of loops. Passing None results in an infinite loop.

Returns

An interpolated camera object formed by the cameras trajectory.

Return type

(Iterator[kaolin.render.camera.Camera])
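The looping and repeat behavior can be sketched on 2D positions. The smoothstep blend used here is a simple stand-in for the ‘polynomial’ mode and may differ from Kaolin's interpolator; `loop_position_path` is a hypothetical helper.

```python
import itertools


def smoothstep(t):
    """Smoothstep weight 3t^2 - 2t^3 for t in [0, 1]."""
    return t * t * (3.0 - 2.0 * t)


def loop_position_path(keyframes, frames_between=2, repeat=None):
    """Yield positions around a closed loop; infinite when repeat is None."""
    n = len(keyframes)
    laps = itertools.count() if repeat is None else range(repeat)
    for _ in laps:
        for i in range(n):
            start, end = keyframes[i], keyframes[(i + 1) % n]  # last wraps to first
            for f in range(frames_between):
                w = smoothstep(f / frames_between)
                yield tuple((1.0 - w) * a + w * b for a, b in zip(start, end))


keyframes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
two_laps = list(loop_position_path(keyframes, frames_between=2, repeat=2))
endless = loop_position_path(keyframes)          # never exhausted
first_20 = list(itertools.islice(endless, 20))   # take only what is needed
```

An infinite generator like this should be consumed with a bounded loop or itertools.islice, never with list() directly.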

kaolin.render.camera.opengl_coords()

Contemporary OpenGL doesn’t enforce a specific handedness on world coordinates. However, it is common practice to define OpenGL world coordinates as right-handed, with the y axis pointing upwards (Cartesian):

   Y
   ^
   |
   |---------> X
  /
Z
kaolin.render.camera.perspective_camera(points, camera_proj)

Projects 3D points on 2D images in perspective projection mode.

Parameters
  • points (torch.FloatTensor) – 3D points in camera coordinates, of shape \((\text{batch_size}, \text{num_points}, 3)\).

  • camera_proj (torch.FloatTensor) – projection matrix of shape \((3, 1)\).

Returns

2D points on image plane of shape \((\text{batch_size}, \text{num_points}, 2)\).

Return type

(torch.FloatTensor)
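One way to read the \((3, 1)\) projection matrix is as three per-axis scale factors derived from \(\tan(\frac{fovy}{2})\), applied before dividing by depth. The sketch below works under that assumption, which should be checked against the Kaolin source; `perspective_scales` and `project` are hypothetical helpers operating on a single point.

```python
import math


def perspective_scales(fovyangle, ratio=1.0):
    """Assumed layout: (x scale, y scale, z sign) from the vertical FOV."""
    tanfov = math.tan(fovyangle / 2.0)
    return (1.0 / (ratio * tanfov), 1.0 / tanfov, -1.0)


def project(point, scales):
    """Scale each axis, then divide x and y by the (sign-flipped) depth."""
    x, y, z = (p * s for p, s in zip(point, scales))
    return (x / z, y / z)


scales = perspective_scales(math.radians(90.0), ratio=1.0)
uv = project((1.0, 2.0, -4.0), scales)  # point 4 units in front of the camera
```

With a 90 degree vertical field of view, \(\tan(\frac{fovy}{2}) = 1\), so the projection reduces to dividing x and y by the distance along the view axis.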

kaolin.render.camera.register_backend(name)

Registers a representation backend class with a unique name.

CameraExtrinsics can switch between registered representations dynamically (see switch_backend()).

Parameters

name (str) – unique name under which the backend class is registered.

kaolin.render.camera.rotate_translate_points(points, camera_rot, camera_trans)

Rotate and translate 3D points based on a rotation matrix and a translation matrix.

Formula is \(\text{P_new} = R * (\text{P_old} - T)\)

Parameters
  • points (torch.FloatTensor) – 3D points, of shape \((\text{batch_size}, \text{num_points}, 3)\).

  • camera_rot (torch.FloatTensor) – rotation matrix, of shape \((\text{batch_size}, 3, 3)\).

  • camera_trans (torch.FloatTensor) – translation matrix, of shape \((\text{batch_size}, 3, 1)\).

Returns

The transformed 3D points, of the same shape as points.

Return type

(torch.FloatTensor)

kaolin.render.camera.up_to_homogeneous(vectors)

Up-projects vectors to homogeneous coordinates of four dimensions. If the vectors are already in homogeneous coordinates, this function returns the inputs.

Parameters

vectors (torch.Tensor) – the input vectors to project, of shape \((..., 3)\)

Returns

The projected vectors, of the same shape as the inputs except that the last dimension is 4.

Return type

(torch.Tensor)
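The behavior described above can be sketched in plain Python on a list of tuples, assuming the appended coordinate is w = 1. `up_to_homogeneous_sketch` is a hypothetical illustration, not the tensor-based Kaolin function.

```python
def up_to_homogeneous_sketch(vectors):
    """Append w = 1 to 3D vectors; pass 4D vectors through unchanged."""
    result = []
    for v in vectors:
        if len(v) == 4:
            result.append(tuple(v))        # already homogeneous
        elif len(v) == 3:
            result.append((*v, 1.0))       # pad with w = 1
        else:
            raise ValueError("expected vectors with 3 or 4 components")
    return result


lifted = up_to_homogeneous_sketch([(1.0, 2.0, 3.0)])
unchanged = up_to_homogeneous_sketch([(1.0, 2.0, 3.0, 1.0)])
```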