kaolin.render.camera¶
Kaolin provides an extensive camera API. For an overview, see the Camera class docs.
API¶
Classes¶
Camera Conversions¶
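Aligning camera conventions across different codebases can take time and care. Kaolin ships with converters between kaolin.render.camera.Camera and the camera conventions of several popular codebases. The converters documented below cover the nerfstudio gsplat library, the INRIA gaussian splats codebase, and polyscope.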
Community contributions are welcome to expand this set.
- kaolin.render.camera.kaolin_camera_to_gsplat_nerfstudio(kal_camera)¶
Convert Kaolin Camera to nerfstudio gsplat library camera parameters, as expected by gsplat.rendering.rasterization. Batched conversion is supported. Only the Pinhole camera model is covered.
Note
This has been tested with the version gsplat==1.4.0.
- Parameters
kal_camera (Camera) – camera to convert.
- Returns
A dict with the following keys:
"Ks" (torch.Tensor): intrinsics matrix of shape \((C, 3, 3)\).
"viewmats" (torch.Tensor): view matrix of shape \((C, 4, 4)\).
"width" (int): image width from the source camera.
"height" (int): image height from the source camera.
"camera_model" (str): always "pinhole".
C is the number of cameras.
- Return type
(dict)
- Raises
RuntimeError – if kal_camera does not use a pinhole intrinsics model.
- kaolin.render.camera.gsplat_nerfstudio_camera_to_kaolin(Ks, viewmats, width=None, height=None, camera_model='pinhole', near_plane=0.01, far_plane=100.0)¶
Convert nerfstudio gsplat library camera parameters, as expected by gsplat.rendering.rasterization, to Kaolin Camera. Batched conversion is supported.
- Parameters
Ks (torch.Tensor) – (C, 3, 3) matrix
viewmats (torch.Tensor) – (C, 4, 4) matrix
width (optional, int) – image width; if not set, the value is guessed from Ks
height (optional, int) – image height; if not set, the value is guessed from Ks
camera_model (optional, str) – currently only pinhole is supported
near_plane (optional, float) – near clipping plane, defines the min depth of the view frustum.
far_plane (optional, float) – far clipping plane, defines the max depth of the view frustum.
- Returns
converted Kaolin camera.
- Return type
(Camera)
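The Ks layout and the width/height guess described above can be illustrated with a small plain-Python sketch. It is not Kaolin's implementation: the helpers pinhole_intrinsics and guess_image_size are hypothetical, and the guess assumes that the principal point sits at the image center, which may differ from what the converter actually does.

```python
import math


def pinhole_intrinsics(fov_y, width, height):
    """Build a 3x3 pinhole intrinsics matrix as nested lists (hypothetical helper)."""
    # Focal length in pixels derived from the vertical field of view.
    focal = (height / 2.0) / math.tan(fov_y / 2.0)
    # Assumption: the principal point is the image center.
    cx, cy = width / 2.0, height / 2.0
    return [[focal, 0.0, cx],
            [0.0, focal, cy],
            [0.0, 0.0, 1.0]]


def guess_image_size(K):
    """Guess (width, height) from K, assuming a centered principal point."""
    return int(round(2.0 * K[0][2])), int(round(2.0 * K[1][2]))


K = pinhole_intrinsics(math.radians(90.0), width=640, height=480)
print(guess_image_size(K))
```

When the principal point is off-center, such a guess is wrong, which is a good reason to pass width and height explicitly whenever they are known.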
- kaolin.render.camera.kaolin_camera_to_gsplat_inria(kal_camera, gs_cam_cls)¶
Converts Kaolin Camera to INRIA gaussian splats camera (gsplats.scene.cameras.Camera).
Note
This has been tested with the version commit 472689c
- Parameters
kal_camera (Camera) – camera to convert.
gs_cam_cls (class) – This is the gsplats Camera class, usually located in gsplats/scene/cameras.py.
- Returns
converted INRIA gaussian splats camera.
- Return type
(gsplats.scene.cameras.Camera)
- kaolin.render.camera.gsplat_inria_camera_to_kaolin(gs_camera)¶
Convert INRIA gaussian splats camera (gsplats.scene.cameras.Camera) to Kaolin Camera.
Note
This has been tested with the version commit 472689c
- Parameters
gs_camera (gsplats.scene.cameras.Camera) – camera to convert.
- Returns
converted Kaolin camera.
- Return type
(Camera)
- kaolin.render.camera.kaolin_camera_to_polyscope(camera)¶
Converts Kaolin Camera to a polyscope camera (polyscope.core.CameraParameters). Polyscope cameras are always assumed to exist on a cpu device. The converted information includes the camera extrinsics, and intrinsics for the field of view.
- Parameters
camera (Camera) – camera to convert.
- Returns
A polyscope camera object.
- Return type
(ps.core.CameraParameters)
- kaolin.render.camera.polyscope_camera_to_kaolin(ps_camera, width, height, near=0.01, far=100.0, dtype=torch.float32, device='cpu')¶
Converts a polyscope camera (polyscope.core.CameraParameters) to Kaolin Camera. The converted information includes the camera extrinsics, the image plane dimensions and the field of view. Additional parameters that kaolin cameras assume and polyscope does not, such as the near and far planes and the device, can be passed explicitly if needed.
- Parameters
ps_camera (ps.core.CameraParameters) – A polyscope camera object.
width (int) – Image plane width in pixels.
height (int) – Image plane height in pixels.
near (optional, float) – near clipping plane, defines the min depth of the view frustum.
far (optional, float) – far clipping plane, defines the max depth of the view frustum.
dtype (optional, torch.dtype) – Datatype of the kaolin camera, converted from polyscope float32 precision.
device (optional, torch.device or str) – the device on which camera parameters will be allocated. Default: cpu
- Returns
A kaolin camera object.
- Return type
(Camera)
Functions¶
- class kaolin.render.camera.CameraFOV(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases: IntEnum
The camera’s field of view can be defined along any one of the following directions:
- DIAGONAL = 2¶
- HORIZONTAL = 0¶
- VERTICAL = 1¶
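The three directions are related through the image dimensions. The sketch below is a plain-Python illustration under the usual pinhole assumption that tan(fov / 2) is proportional to the image extent along that direction. FovDirection is a hypothetical stand-in that only mirrors the enum values listed above, and convert_fov is not part of Kaolin.

```python
import math
from enum import IntEnum


class FovDirection(IntEnum):
    """Hypothetical mirror of the CameraFOV values, for illustration only."""
    HORIZONTAL = 0
    VERTICAL = 1
    DIAGONAL = 2


def convert_fov(fov, src, dst, width, height):
    """Convert a pinhole field of view (radians) from one direction to another."""
    extent = {
        FovDirection.HORIZONTAL: width,
        FovDirection.VERTICAL: height,
        FovDirection.DIAGONAL: math.hypot(width, height),
    }
    # tan(fov / 2) scales linearly with the image extent along the direction.
    half_tan = math.tan(fov / 2.0) * extent[dst] / extent[src]
    return 2.0 * math.atan(half_tan)


vertical = convert_fov(math.radians(90.0), FovDirection.HORIZONTAL,
                       FovDirection.VERTICAL, width=200, height=100)
print(math.degrees(vertical))
```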
- kaolin.render.camera.allclose(input, other, rtol=1e-05, atol=1e-08, equal_nan=False)¶
This function checks if the camera extrinsics and intrinsics are close using torch.allclose().
- Parameters
- Returns
Result of the comparison
- Return type
(bool)
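For reference, torch.allclose() accepts two values as close when |input - other| <= atol + rtol * |other| holds for every element. The plain-Python sketch below applies that rule to flat lists of numbers. It only illustrates the tolerance test and is not how Kaolin compares cameras internally.

```python
def is_close(a, b, rtol=1e-05, atol=1e-08):
    """Element-wise closeness rule used by torch.allclose()."""
    return abs(a - b) <= atol + rtol * abs(b)


def all_close(xs, ys, rtol=1e-05, atol=1e-08):
    """True when both sequences have equal length and all pairs are close."""
    return len(xs) == len(ys) and all(
        is_close(x, y, rtol, atol) for x, y in zip(xs, ys))


print(all_close([1.0, 2.0], [1.0, 2.00001]))
print(all_close([1.0], [1.001]))
```

Note that the rule is not symmetric: the relative tolerance is scaled by the magnitude of the second argument.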
- kaolin.render.camera.blender_coords()¶
Blender world coordinates are right handed, with the z axis pointing upwards
Z      Y
^    /
|  /
|---------> X
- kaolin.render.camera.camera_path_generator(trajectory, frames_between_cameras=60, interpolation='catmull_rom')¶
A finite generator function for returning continuous camera objects on a path interpolated from a trajectory of cameras.
This generator is exhausted after it returns the last point on the path. If interpolation is ‘polynomial’, the trajectory is assumed to contain at least 2 cameras. If interpolation is ‘catmull_rom’, the trajectory is assumed to contain at least 4 cameras.
- Parameters
trajectory (List[kaolin.render.camera.Camera]) – A trajectory of camera nodes, used to form a continuous path.
frames_between_cameras (int) – Number of interpolated points generated between each pair of cameras on the trajectory. In essence, this value controls how detailed, or smooth the path is.
interpolation (str) – Type of interpolation function used: ‘polynomial’ uses a smoothstep polynomial function which tends to overshoot around the keyframes. This interpolator is fitting for paths orbiting an object of interest. ‘catmull_rom’ uses a spline defined by 4 control points, guaranteed to pass precisely through the keyframes.
- Returns
An interpolated camera object formed by the camera trajectory.
- Return type
(Iterator[kaolin.render.camera.Camera])
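To illustrate why a Catmull-Rom path needs at least 4 control points and passes exactly through the keyframes, here is a minimal plain-Python sketch that interpolates 3D positions only. A real camera path also has to interpolate orientation and intrinsics, and the helper names catmull_rom and path_generator are hypothetical, not Kaolin functions. The handling of the first and last keyframes is an assumption of this sketch.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom spline value between p1 and p2, for t in [0, 1]."""
    return 0.5 * (2 * p1
                  + (-p0 + p2) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3)


def path_generator(keyframes, frames_between=4):
    """Finite generator over positions interpolated between interior keyframes."""
    if len(keyframes) < 4:
        raise ValueError("Catmull-Rom interpolation needs at least 4 keyframes")
    # Each segment runs from keyframes[i] to keyframes[i + 1] and uses one
    # extra neighbor on each side as a control point.
    for i in range(1, len(keyframes) - 2):
        p0, p1, p2, p3 = keyframes[i - 1:i + 3]
        for frame in range(frames_between):
            t = frame / frames_between
            yield tuple(catmull_rom(a, b, c, d, t)
                        for a, b, c, d in zip(p0, p1, p2, p3))
    # Close the path on the last interpolated keyframe, then stop.
    yield tuple(float(v) for v in keyframes[-2])


keyframes = [(0, 0, 0), (1, 0, 0), (2, 1, 0), (3, 1, 0), (4, 0, 0)]
path = list(path_generator(keyframes, frames_between=4))
print(len(path))
```

At t = 0 the polynomial reduces to p1 and at t = 1 it reduces to p2, which is what makes the curve hit each keyframe exactly.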
- kaolin.render.camera.down_from_homogeneous(homogeneous_vectors)¶
Performs perspective division by dividing each vector by its w coordinate.
Down-projects vectors from 4D homogeneous space to 3D space.
- Parameters
homogeneous_vectors (Tensor) – the input vectors, of shape \((..., 4)\)
- Returns
the 3D vectors, of the same shape as the inputs except that the last dimension is 3
- Return type
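Perspective division itself is a one-line operation. The plain-Python sketch below works on lists of 4-component vectors rather than tensors, and perspective_divide is a hypothetical name used for illustration.

```python
def perspective_divide(homogeneous_vectors):
    """Divide x, y, z by w and drop the w coordinate."""
    return [[x / w, y / w, z / w] for x, y, z, w in homogeneous_vectors]


points = perspective_divide([[2.0, 4.0, 6.0, 2.0], [1.0, 1.0, 1.0, 1.0]])
print(points)
```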
- kaolin.render.camera.generate_centered_custom_resolution_pixel_coords(img_width, img_height, res_x=None, res_y=None, device=None)¶
Creates a pixel grid with a custom resolution, with the rays spaced out according to the scale. The scale is determined by the ratio of \(\text{img_width / res_x, img_height / res_y}\). The ray grid is of resolution \(\text{res_x} \times \text{res_y}\).
- Parameters
img_width (int) – width of camera image plane.
img_height (int) – height of camera image plane.
res_x (int) – x resolution of pixel grid to be created
res_y (int) – y resolution of pixel grid to be created
device (torch.device, optional) – Device on which the grid tensors will be created.
- Returns
A tuple of two tensors of shapes \((\text{height, width})\).
Tensor 0 contains rows of running indices: \((\text{s, s, ..., s})\) up to \((\text{height-s, height-s... height-s})\).
Tensor 1 contains repeated rows of indices: \((\text{s, s+1, ..., width-s})\).
\(\text{s}\) is \(\text{scale/2}\) where \(\text{scale}\) is \((\text{img_width / res_x, img_height / res_y})\).
- Return type
meshgrid (torch.FloatTensor, torch.FloatTensor)
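The relation between the image size, the grid resolution and the offset s can be sketched in plain Python. This sketch assumes that neighboring samples are spaced one scale apart, starting at scale / 2. It returns nested lists instead of tensors, and centered_custom_grid is a hypothetical helper, not the Kaolin function.

```python
def centered_custom_grid(img_width, img_height, res_x, res_y):
    """Return (grid_y, grid_x) as nested lists of shape (res_y, res_x)."""
    scale_x = img_width / res_x
    scale_y = img_height / res_y
    # Each sample sits at the center of a cell of size (scale_x, scale_y).
    xs = [(col + 0.5) * scale_x for col in range(res_x)]
    ys = [(row + 0.5) * scale_y for row in range(res_y)]
    grid_y = [[y] * res_x for y in ys]   # each row repeats one y value
    grid_x = [list(xs) for _ in ys]      # every row holds the same x values
    return grid_y, grid_x


grid_y, grid_x = centered_custom_grid(img_width=8, img_height=4, res_x=4, res_y=2)
print(grid_x[0], grid_y[1])
```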
- kaolin.render.camera.generate_centered_pixel_coords(img_width, img_height, device=None)¶
Creates a pixel grid with rays intersecting the center of each pixel. The ray grid is of resolution img_width x img_height.
- Parameters
img_width (int) – width of image.
img_height (int) – height of image.
device (torch.device, optional) – Device on which the grid tensors will be created.
- Returns
A tuple of two tensors of shapes \((\text{height, width})\).
Tensor 0 contains rows of running indices: \((\text{0.5, 0.5, ..., 0.5})\) up to \((\text{height-0.5, height-0.5... height-0.5})\).
Tensor 1 contains repeated rows of indices: \((\text{0.5, 1.5, ..., width-0.5})\).
- Return type
meshgrid (torch.FloatTensor, torch.FloatTensor)
- kaolin.render.camera.generate_default_grid(width, height, device=None)¶
Creates a pixel grid of integer coordinates with resolution width x height.
- Parameters
width (int) – width of image.
height (int) – height of image.
device (torch.device, optional) – Device on which the meshgrid tensors will be created.
- Returns
A tuple of two tensors of shapes \((\text{height, width})\).
Tensor 0 contains rows of running indices: \((\text{0, 0, ..., 0})\) up to \((\text{height-1, height-1... height-1})\).
Tensor 1 contains repeated rows of indices: \((\text{0, 1, ..., width-1})\).
- Return type
meshgrid (torch.FloatTensor, torch.FloatTensor)
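The integer grid and the centered grid differ only by a half-pixel shift. The plain-Python sketch below reproduces the two layouts described in the return values above, with nested lists in place of tensors. The helper names are hypothetical.

```python
def default_grid(width, height):
    """Integer pixel coordinates as (grid_y, grid_x), each of shape (height, width)."""
    grid_y = [[float(row)] * width for row in range(height)]
    grid_x = [[float(col) for col in range(width)] for _ in range(height)]
    return grid_y, grid_x


def centered_grid(width, height):
    """Same layout, shifted by 0.5 so that each entry is a pixel center."""
    grid_y, grid_x = default_grid(width, height)

    def shift(grid):
        return [[value + 0.5 for value in row] for row in grid]

    return shift(grid_y), shift(grid_x)


center_y, center_x = centered_grid(width=3, height=2)
print(center_y)
print(center_x)
```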
- kaolin.render.camera.generate_ortho_rays(camera, coords_grid=None)¶
Default ray generation function for ortho cameras.
- Parameters
camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).
coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate gives the pixel \((\text{i, j})\) coords, and the fractional part in \([\text{0,1}]\) gives the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.
- Returns
The generated ortho rays for the camera, as ray origins and ray direction tensors of shape \((\text{HxW, 3})\).
- Return type
(torch.FloatTensor, torch.FloatTensor)
- kaolin.render.camera.generate_perspective_projection(fovyangle, ratio=1.0, dtype=torch.float32)¶
Generate perspective projection matrix for a given camera fovy angle.
- kaolin.render.camera.generate_pinhole_rays(camera, coords_grid=None)¶
Default ray generation function for pinhole cameras.
This function assumes that the principal point (the pinhole location) is specified by a displacement (camera.x0, camera.y0) in pixel coordinates from the center of the image.
The Kaolin camera class does not enforce a coordinate space for how the principal point is specified, so users will need to make sure that the correct principal point conventions are followed for the cameras passed into this function.
- Parameters
camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).
coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate gives the pixel \((\text{i, j})\) coords, and the fractional part in \([\text{0,1}]\) gives the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.
- Returns
The generated pinhole rays for the camera, as ray origins and ray direction tensors of \((\text{HxW, 3})\).
- Return type
(torch.FloatTensor, torch.FloatTensor)
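A pinhole ray through a pixel coordinate can be sketched in plain Python. The conventions here are assumptions made for illustration and are not taken from Kaolin: the camera sits at the origin, looks down the negative z axis with y up, and has zero principal point displacement. pinhole_ray is a hypothetical helper.

```python
import math


def pinhole_ray(px, py, width, height, fov_y):
    """Return (origin, unit direction) of the ray through pixel coords (px, py)."""
    focal = (height / 2.0) / math.tan(fov_y / 2.0)  # focal length in pixels
    x = (px - width / 2.0) / focal
    y = -(py - height / 2.0) / focal  # image y grows downward, camera y is up
    z = -1.0                          # assumed viewing direction
    norm = math.sqrt(x * x + y * y + z * z)
    return (0.0, 0.0, 0.0), (x / norm, y / norm, z / norm)


# Pixel coordinate (0.5, 0.5) is the center of the top-left pixel.
origin, direction = pinhole_ray(0.5, 0.5, width=4, height=4, fov_y=math.radians(90.0))
print(direction)
```

All pinhole rays share the same origin and differ in direction. An orthographic camera would instead keep one direction and vary the origins across the image plane.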
- kaolin.render.camera.generate_rays(camera, coords_grid=None)¶
Default ray generation function for unbatched kaolin cameras. The camera lens type determines the exact raygen logic that runs (e.g. pinhole, ortho).
- Parameters
camera (kaolin.render.camera.Camera) – A single camera object (batch size 1).
coords_grid (torch.FloatTensor, optional) – Pixel grid of ray-intersecting coordinates of shape \((\text{H, W, 2})\). The integer part of each coordinate gives the pixel \((\text{i, j})\) coords, and the fractional part in \([\text{0,1}]\) gives the location within the pixel itself. For example, a coordinate of \((\text{0.5, 0.5})\) represents the center of the top-left pixel.
- Returns
The generated camera rays according to the camera lens type, as ray origins and ray direction tensors of \((\text{HxW, 3})\).
- Return type
(torch.FloatTensor, torch.FloatTensor)
- kaolin.render.camera.generate_rotate_translate_matrices(camera_position, look_at, camera_up_direction)¶
Generate rotation and translation matrix for given camera parameters.
Formula is \(\text{P_cam} = \text{rot_mtx} * (\text{P_world} - \text{trans_mtx})\)
- Parameters
camera_position (torch.FloatTensor) – camera positions, i.e. where the cameras are located, of shape \((\text{batch_size}, 3)\)
look_at (torch.FloatTensor) – the points the cameras are looking at, of shape \((\text{batch_size}, 3)\)
camera_up_direction (torch.FloatTensor) – camera up directions of shape \((\text{batch_size}, 3)\), generally [0, 1, 0]
- Returns
the camera rotation matrix of shape \((\text{batch_size}, 3, 3)\) and the camera transformation matrix of shape \((\text{batch_size}, 3)\)
- Return type
(torch.FloatTensor, torch.FloatTensor)
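The formula P_cam = rot_mtx * (P_world - trans_mtx) can be illustrated with a look-at construction in plain Python for a single camera. The axis convention used here is an assumption and may differ from Kaolin's: a right-handed camera frame whose z axis points from look_at back toward the camera. All helper names are hypothetical.

```python
import math


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def look_at_matrices(camera_position, look_at, up):
    """Return (rot, trans): rot rows are the camera axes, trans is the position."""
    cam_z = normalize(sub(camera_position, look_at))  # assumed backward axis
    cam_x = normalize(cross(up, cam_z))               # right axis
    cam_y = cross(cam_z, cam_x)                       # true up axis
    return [cam_x, cam_y, cam_z], list(camera_position)


def to_camera(rot, trans, point):
    """Apply P_cam = rot * (P_world - trans)."""
    d = sub(point, trans)
    return [sum(r * c for r, c in zip(row, d)) for row in rot]


rot, trans = look_at_matrices([0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(to_camera(rot, trans, [0.0, 0.0, 0.0]))
```

With this convention the camera position maps to the origin of camera space and the look-at target lands on the negative z axis.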
- kaolin.render.camera.generate_transformation_matrix(camera_position, look_at, camera_up_direction)¶
Generate transformation matrix for given camera parameters.
Formula is \(\text{P_cam} = \text{P_world} * \text{transformation_mtx}\), with \(\text{P_world}\) being the point coordinates padded with 1.
- Parameters
camera_position (torch.FloatTensor) – camera positions, i.e. where the cameras are located, of shape \((\text{batch_size}, 3)\)
look_at (torch.FloatTensor) – the points the cameras are looking at, of shape \((\text{batch_size}, 3)\)
camera_up_direction (torch.FloatTensor) – camera up directions of shape \((\text{batch_size}, 3)\), generally [0, 1, 0]
- Returns
The camera transformation matrix of shape \((\text{batch_size}, 4, 3)\).
- Return type
(torch.FloatTensor)
- kaolin.render.camera.gsplats_camera_to_kaolin(gs_camera)¶
Deprecated function name for INRIA camera conversion.
Use gsplat_inria_camera_to_kaolin() instead.
- kaolin.render.camera.kaolin_camera_to_gsplats(kal_camera, gs_cam_cls)¶
Deprecated function name for INRIA camera conversion.
For the INRIA Gaussian Splats codebase, use: kaolin_camera_to_gsplat_inria()
For the NerfStudio gsplat package, use: kaolin_camera_to_gsplat_nerfstudio()
- kaolin.render.camera.loop_camera_path_generator(trajectory, frames_between_cameras=60, interpolation='polynomial', repeat=None)¶
A generator function for returning continuous camera objects on a smoothed path interpolated from a trajectory of cameras.
The trajectory is assumed to contain at least 2 cameras, where the first and last cameras form a looped path. Certain interpolation modes (e.g. catmull_rom) may require additional cameras. If repeat is None, this generator is never exhausted, and can be invoked infinitely to generate continuous camera motion. Otherwise the loop will repeat a finite number of times.
- Parameters
trajectory (List[kaolin.render.camera.Camera]) – A trajectory of camera nodes, used to form a continuous path.
frames_between_cameras (int) – Number of interpolated points generated between each pair of cameras on the trajectory. In essence, this value controls how detailed, or smooth the path is.
interpolation (str) – Type of interpolation function used: ‘polynomial’ uses a smoothstep polynomial function which tends to overshoot around the keyframes. This interpolator is fitting for paths orbiting an object of interest. ‘catmull_rom’ uses a spline defined by 4 control points, guaranteed to pass precisely through the keyframes.
repeat (int, Optional) – If specified, will limit the number of loops. Passing None results in an infinite loop.
- Returns
An interpolated camera object formed by the camera trajectory.
- Return type
(Iterator[kaolin.render.camera.Camera])
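The looping behavior and the effect of repeat can be sketched in plain Python. This illustration blends 3D positions with a smoothstep weight, which is only one reading of the ‘polynomial’ mode and is not claimed to match Kaolin's interpolator. The helper names are hypothetical.

```python
import itertools


def smoothstep(t):
    """Smoothstep polynomial 3t^2 - 2t^3 on [0, 1]."""
    return t * t * (3.0 - 2.0 * t)


def loop_path_generator(keyframes, frames_between=4, repeat=None):
    """Yield positions along a closed path; endless when repeat is None."""
    loops = itertools.count() if repeat is None else range(repeat)
    count = len(keyframes)
    for _ in loops:
        for i in range(count):
            # The last keyframe connects back to the first one.
            start, end = keyframes[i], keyframes[(i + 1) % count]
            for frame in range(frames_between):
                weight = smoothstep(frame / frames_between)
                yield tuple(a + (b - a) * weight for a, b in zip(start, end))


keyframes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
finite = list(loop_path_generator(keyframes, frames_between=2, repeat=3))
endless = loop_path_generator(keyframes, frames_between=2)
first_ten = list(itertools.islice(endless, 10))
print(len(finite), len(first_ten))
```

Because the endless variant never raises StopIteration, consumers should bound it themselves, for example with itertools.islice as above.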
- kaolin.render.camera.opengl_coords()¶
Contemporary OpenGL doesn’t enforce specific handedness on world coordinates. However, it is a common standard to define OpenGL world coordinates as right handed, with the y axis pointing upwards (cartesian):
 Y
 ^
 |
 |---------> X
/
Z
- kaolin.render.camera.perspective_camera(points, camera_proj)¶
Projects 3D points on 2D images in perspective projection mode.
- Parameters
points (torch.FloatTensor) – 3D points in camera coordinates, of shape \((\text{batch_size}, \text{num_points}, 3)\).
camera_proj (torch.FloatTensor) – projection matrix of shape \((3, 1)\).
- Returns
2D points on image plane of shape \((\text{batch_size}, \text{num_points}, 2)\).
- Return type
(torch.FloatTensor)
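A perspective projection of this kind can be sketched in plain Python. The sketch assumes that the three-component projection holds per-axis scale factors for x, y and z, which would be consistent with the \((3, 1)\) shape above, and that projected points are obtained by dividing the scaled x and y by the scaled z. The exact contents of camera_proj in Kaolin are not stated in this page, so treat both helpers as hypothetical.

```python
import math


def perspective_projection(fov_y, ratio=1.0):
    """Assumed projection: scale factors for x, y and z (hypothetical layout)."""
    tan_half = math.tan(fov_y / 2.0)
    return [1.0 / (ratio * tan_half), 1.0 / tan_half, -1.0]


def project(points, proj):
    """Scale each camera-space point per axis, then divide x and y by z."""
    projected = []
    for x, y, z in points:
        sx, sy, sz = x * proj[0], y * proj[1], z * proj[2]
        projected.append((sx / sz, sy / sz))
    return projected


proj = perspective_projection(math.radians(90.0))
image_points = project([(1.0, 2.0, -4.0)], proj)
print(image_points)
```

The negative z factor makes points in front of a camera that looks down the negative z axis end up with a positive divisor.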
- kaolin.render.camera.register_backend(name)¶
Registers a representation backend class with a unique name.
CameraExtrinsics can switch between registered representations dynamically (see switch_backend()).
- Parameters
name (str) –
- kaolin.render.camera.rotate_translate_points(points, camera_rot, camera_trans)¶
Rotate and translate 3D points based on a rotation matrix and a translation matrix.
Formula is \(\text{P_new} = R * (\text{P_old} - T)\)
- Parameters
points (torch.FloatTensor) – 3D points, of shape \((\text{batch_size}, \text{num_points}, 3)\).
camera_rot (torch.FloatTensor) – rotation matrix, of shape \((\text{batch_size}, 3, 3)\).
camera_trans (torch.FloatTensor) – translation matrix, of shape \((\text{batch_size}, 3, 1)\).
- Returns
3D points in the new rotation, of the same shape as points.
- Return type
(torch.FloatTensor)
- kaolin.render.camera.up_to_homogeneous(vectors)¶
Up-projects vectors to homogeneous coordinates of four dimensions. If the vectors are already in homogeneous coordinates, this function returns the inputs.
- Parameters
vectors (torch.Tensor) – the input vectors to project, of shape \((..., 3)\)
- Returns
The projected vectors, of the same shape as the inputs except that the last dimension is 4
- Return type
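The up-projection amounts to appending w = 1 to each 3D vector and leaving 4D inputs untouched. The plain-Python sketch below uses nested lists instead of tensors, and to_homogeneous is a hypothetical name used for illustration.

```python
def to_homogeneous(vectors):
    """Append w = 1 to 3D vectors; pass 4D vectors through unchanged."""
    return [list(v) if len(v) == 4 else list(v) + [1.0] for v in vectors]


lifted = to_homogeneous([[1.0, 2.0, 3.0]])
unchanged = to_homogeneous([[1.0, 2.0, 3.0, 0.5]])
print(lifted, unchanged)
```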