Output list
Book chapter
AEDNet: Adaptive Embedding and Multiview-Aware Disentanglement for Point Cloud Completion
Published 2024
Computer Vision – ECCV 2024, 127 - 143
Point cloud completion involves inferring missing parts of 3D objects from incomplete point cloud data. It requires a model that understands the global structure of the object and reconstructs local details. To this end, we propose a global perception and local attention network, termed AEDNet, for point cloud completion. AEDNet uses a specially designed adaptive point cloud embedding and disentanglement (AED) module in both the encoder and decoder to globally embed and locally disentangle the given point cloud. In the AED module, we introduce a global embedding operator that employs a purpose-designed slot attention to compose point clouds into different embeddings, each focusing on specific parts of 3D objects. Then, we propose a multiview-aware disentanglement operator to disentangle geometric information from those embeddings in 3D viewpoints generated on a unit sphere. These 3D viewpoints enable us to observe point clouds from the outside rather than from within, resulting in a comprehensive understanding of their geometry. Additionally, an arbitrary number of points and point-wise features can be disentangled by changing the number of viewpoints, which makes the module highly flexible. Experiments show that our proposed method achieves state-of-the-art results on both the MVP and PCN datasets.
Book chapter
A Riemannian Approach for Spatiotemporal Analysis and Generation of 4D Tree-Shaped Structures
Published 2024
Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXVII, 326 - 341
We propose the first comprehensive approach for modeling and analyzing the spatiotemporal shape variability in tree-like 4D objects, i.e., 3D objects whose shapes bend, stretch and change in their branching structure over time as they deform, grow, and interact with their environment. Our key contribution is the representation of tree-like 3D shapes using Square Root Velocity Function Trees (SRVFT) [21]. By solving the spatial registration in the SRVFT space, which is equipped with an $\mathbb{L}^2$ metric, 4D tree-shaped structures become time-parameterized trajectories in this space. This reduces the problem of modeling and analyzing 4D tree-like shapes to that of modeling and analyzing elastic trajectories in the SRVFT space, where elasticity refers to time warping. In this paper, we propose a novel mathematical representation of the shape space of such trajectories, a Riemannian metric on that space, and computational tools for fast and accurate spatiotemporal registration and geodesics computation between 4D tree-shaped structures. Leveraging these building blocks, we develop a full framework for modelling the spatiotemporal variability using statistical models and generating novel 4D tree-like structures from a set of exemplars. We demonstrate and validate the proposed framework using real 4D plant data.
Book chapter
RCNN for region of interest detection in whole slide images
Published 2020
Neural Information Processing, 1333, 625 - 632
Digital pathology has attracted significant attention in recent years. Analysis of Whole Slide Images (WSIs) is challenging because they are very large, i.e., of giga-pixel resolution. Identifying Regions of Interest (ROIs) is the first step that allows pathologists to further analyse the regions of diagnostic interest for cancer detection and other anomalies. In this paper, we investigate the use of RCNN, a deep machine learning technique, for detecting such ROIs using only a small number of labelled WSIs for training. For experimentation, we used real WSIs from a public hospital pathology service in Western Australia. We used 60 WSIs for training the RCNN model and another 12 WSIs for testing. The model was further tested on a new set of unseen WSIs. The results show that RCNN can be effectively used for ROI detection from WSIs.
Book chapter
RGB-D image-based Object Detection: From traditional methods to deep learning techniques
Published 2019
RGB-D Image Analysis and Processing, 169 - 201
Object detection from RGB images is a long-standing problem in image processing and computer vision. It has applications in various domains including robotics, surveillance, human-computer interaction, and medical diagnosis. With the availability of low cost 3D scanners, a large number of RGB-D object detection approaches have been proposed in the past years. This chapter provides a comprehensive survey of the recent developments in this field. We structure the chapter into two parts; the focus of the first part is on techniques that are based on hand-crafted features combined with machine learning algorithms. The focus of the second part is on the more recent work, which is based on deep learning. Deep learning techniques, coupled with the availability of large training datasets, have now revolutionized the field of computer vision, including RGB-D object detection, achieving an unprecedented level of performance. We survey the key contributions, summarize the most commonly used pipelines, discuss their benefits and limitations, and highlight some important directions for future research.
Book chapter
A survey on nonrigid 3D shape analysis
Published 2018
Academic Press Library in Signal Processing, Volume 6, 261 - 304
Shape is an important physical property of natural and manmade 3D objects that characterizes their external appearances. Understanding differences between shapes and modeling the variability within and across shape classes, hereinafter referred to as shape analysis, are fundamental problems to many applications, ranging from computer vision and computer graphics to biology and medicine. This chapter provides an overview of some of the recent techniques that studied the shape of 3D objects that undergo nonrigid deformations including bending and stretching. Recent surveys that covered some aspects such as classification, retrieval, recognition, and rigid or nonrigid registration, focused on methods that use shape descriptors. Descriptors, however, provide abstract representations that do not enable the exploration of shape variability. In this chapter, we focus on recent techniques that treated the shape of 3D objects as points in some high dimensional space where paths describe deformations. Equipping the space with a suitable metric enables the quantification of the range of deformations of a given shape, which in turn enables (1) comparing and classifying 3D objects based on their shape, (2) computing smooth deformations, i.e., geodesics, between pairs of objects, and (3) modeling and exploring continuous shape variability in a collection of 3D models. This chapter surveys and classifies recent developments in this field, outlines fundamental issues, discusses their potential applications in computer vision and graphics, and highlights opportunities for future research. Our primary goal is to bridge the gap between various techniques that have been often independently proposed by different communities including mathematics and statistics, computer vision and graphics, and medical image analysis.
Book chapter
Exploring Visuo-Haptic augmented reality user interfaces for Stereo-Tactic neurosurgery planning
Published 2016
Medical Imaging and Augmented Reality, 9805, 208 - 220
Stereo-tactic neurosurgery planning is a time-consuming and complex task that requires detailed understanding of the patient anatomy and the affected regions in the brain to precisely deliver the treatment and to avoid proximity to any known risk structures. Traditional user interfaces for neurosurgery planning use keyboard and mouse for interaction and visualize the medical data on a screen. Previous research, however, has shown that 3D user interfaces are more intuitive for navigating volumetric data and enable users to understand spatial relations more quickly. Furthermore, new imaging modalities and automated segmentation of relevant structures provide important information to medical experts. However, displaying such information requires frequent context switches or occludes otherwise important information. In collaboration with medical experts, we analyzed the planning workflow for stereo-tactic neurosurgery interventions and identified two tasks in the process that can be improved: volume exploration and trajectory refinement. In this paper, we present a novel 3D user interface for neurosurgery planning that is implemented using a head-mounted display and a haptic device. The proposed system improves volume exploration with bi-manual interaction to control oblique slicing of volumetric data and reduces visual clutter with the help of haptic guides that enable users to precisely target regions of interest and to avoid proximity to known risk structures.
Book chapter
Elastic shape analysis of surfaces and images
Published 2016
Riemannian Computing in Computer Vision, 257 - 277
We describe two Riemannian frameworks for statistical shape analysis of parameterized surfaces. These methods provide tools for registration, comparison, deformation, averaging, statistical modeling, and random sampling of surface shapes. A crucial property of both of these frameworks is that they are invariant to reparameterizations of surfaces. Thus, they result in natural shape comparisons and statistics. The first method we describe is based on a special representation of surfaces termed square-root functions (SRFs). The pullback of the L2 metric from the SRF space results in the Riemannian metric on the space of surfaces. The second method is based on the elastic surface metric. We show that a restriction of this metric, which we call the partial elastic metric, becomes the standard L2 metric under the square-root normal field (SRNF) representation. We show the advantages of these methods by computing geodesic paths between highly articulated surfaces and shape statistics of manually generated surfaces. We also describe applications of this framework to image registration and medical diagnosis.
Book chapter
Published 2015
Computer Vision -- ACCV 2014, 9004, 424 - 439
We consider boundaries of planar objects as level set distance functions and present a Riemannian metric for their comparison and analysis. The metric is based on a parameterization-invariant framework for shape analysis of quadrilateral surfaces. Most previous Riemannian formulations of 2D shape analysis are restricted to curves that can be parameterized with a single parameter domain. However, 2D shapes may contain multiple connected components and many internal details that cannot be captured with such parameterizations. In this paper we propose to register planar curves of arbitrary topologies by utilizing the re-parameterization group of quadrilateral surfaces. The criterion used for computing this registration is a proper distance, which can be used to quantify differences between the level set functions and is especially useful in classification. We demonstrate this framework with multiple examples using toy curves, medical imaging data, subsets of the TOSCA data set, 2D hand-drawn sketches, and a 2D version of the SHREC07 data set. We demonstrate that our method outperforms the state-of-the-art in the classification of 2D sketches and performs well compared to other state-of-the-art methods on complex planar shapes.
Book chapter
Generation of 3D canonical anatomical models: An experience on carpal bones
Published 2015
New Trends in Image Analysis and Processing -- ICIAP 2015 Workshops, 9281, 167 - 174
The paper discusses the initial results obtained for the generation of canonical 3D models of anatomical parts, built on real patient data. 3D canonical models of anatomy are key elements in computer-assisted diagnosis; for instance, they can support pathology detection, semantic annotation of patient-specific 3D reconstructions, and quantification of pathological markers. Our approach is focused on carpal bones and on the elastic analysis of 3D reconstructions of these bones, which are segmented from MRI scans, represented as 0-genus triangle meshes, and parameterized on the sphere. The original method [8] relies on a set of sparse correspondences, defined as matching vertices. For medical applications, it is desirable to constrain the mean shape generation to set up the correspondences among a larger set of anatomical landmarks, including vertices, lines, and areas. Preliminary results are discussed and future development directions are sketched.
Book chapter
Graspable parts recognition in Man-Made 3D shapes
Published 2013
Computer Vision – ACCV 2012: 11th Asian Conference on Computer Vision, Daejeon, Korea, November 5-9, 2012, Revised Selected Papers, Part II, 7725, 552 - 564
We address the problem of automatic recognition of graspable parts in man-made 3D shapes, which exhibit high intra-class variability that cannot be captured with geometric descriptors alone. We observe that, in the presence of significant geometric and topological variations, the context of a part within a 3D shape provides important cues about its functionality. We propose to model the context as structural relationships between shape parts and use them, in addition to part geometry, as cues for automatically identifying the graspable parts. We design a set of spatial relationships that can be extracted from general shapes. Then, we propose a new similarity measure that captures a part's context and enables better recognition of graspable parts. We use this property to design a classifier that learns the semantics of a shape part. We demonstrate that our approach outperforms state-of-the-art approaches that are purely geometry-based.