Visual Monocular 3D Reconstruction and Component Identification for Small Spacecraft

A monocular vision pose estimation and identification algorithm used on a small spacecraft for future orbital servicing is studied in this paper. A tracker spacecraft equipped with a short-range vision system is proposed to recover the 3D structural model of a space target in orbit and automatically identify its solar panels and main body using only visual information from an onboard camera. The proposed reconstruction and identification framework is tested using structure-from-motion and point cloud identification methods. The Efficient Perspective-n-Point (EPnP) algorithm is used for pose estimation. Triangulated points are used for component segmentation by means of orientation histogram descriptors. Experimental results based on laboratory images of a spacecraft model show the effectiveness and robustness of our approach.


Introduction
Space object 3D reconstruction, pose estimation and identification are very important for spacecraft orbital servicing and for space situational awareness based on satellite imaging. Structure from Motion (SfM) technology and pose estimation have attracted a lot of interest in recent years as enabling technologies for detecting, tracking, cataloguing and identifying satellites and spacecraft. SfM is a method for obtaining 3D structure using only monocular feature matches between multiple images taken at multiple angles; features can include lines (e.g. Canny edge detection), corners (e.g. Harris corner detection), and other feature types. It also represents a natural progression from feature-based ego-motion estimation between pairs of images into point cloud techniques. 3D reconstruction and identification have been studied extensively, but for such systems to work effectively on small spacecraft with only a single visual sensor, the implementation of point cloud building, image feature point matching, sparse reconstruction, identification strategy and dimensional analysis must be considered.
High-performance imaging sensors such as radar, lidar, and visible and infrared cameras are used for detecting, tracking and identifying objects in orbit. Table 1 shows a list of existing on-orbit servicing missions and demonstrations. RF radar trades off precision for a wide range of operation, and is not as suitable for uncooperative or small targets. The TriDAR system used a LIDAR and an Iterative Closest Point algorithm outside the ISS, without approach or autonomy [1]. Recent automated rendezvous and docking systems make use of optical, laser ranging, and LIDAR systems [2] [3], and visually-aided systems have been tested in proximity operations with NASA's Space Shuttle, JAXA's ETS-VII satellite [4], and other satellites such as the DART mission [5]. The Rendezvous Lidar System (RLS) has also been tested on the XSS-11 spacecraft for rendezvous operations. However, given the complexity and size of such sensors, there is great potential in the use of multiple-view imaging and feature mapping, since only one camera may be necessary. Many pose estimation techniques [6] have been proposed for this, and typically focus on shape tracking and recognition, feature detection and triangulation [7], or a combination of shape and features [8]. The SPHERES experiment uses SURF feature matching with stereo vision for navigation inside the ISS [9]. Images of space objects taken with visible cameras are low resolution and lack texture information. These methodologies face computer vision challenges related to extreme lighting conditions, as specular reflection and hard shadows can lead to mission failure. Many studies have used Kalman filters and other classic vision algorithms with 3D vision sensors for spacecraft on-orbit servicing [10] [11]. A few related works handle satellite recognition, pose estimation, 3D reconstruction and identification using vision only, including structure from motion [12] [13].
Based on the authors' previous work [12], we propose a different approach to the monocular visual estimation problem: recognition and tracking of features for ego-motion from a sequence of images, which can then be inserted into a point cloud, which in turn provides a way to recognize the position of the target. This method is derived from structure-from-motion computer vision methods used in robotics and in photo-tourism reconstructions from large image sets, and requires only that rigid transformations are present between images. We also add component identification and dimensional analysis. The proposed vision pose estimation and identification system shows good performance in experimental results.
This work is intended to be applied and evaluated on a real mission in the near future.
The contributions of this work are summarized as follows. We review the following machine vision methods and how they are implemented: 1. ORB descriptor 2D feature detection and matching between images

Overview of Framework
To allow a tracker spacecraft to identify and estimate the movement of a target spacecraft, we approach this problem as illustrated in Figure 1. First, we build up a feature set of points located in three dimensions by triangulation of keypoints on successive images of the target in the "Approach" phase. We then locate the camera relative to the matched points by a Perspective-n-Point (PnP) solution during the "Track" phase. By projecting the keypoints into three dimensions, we build up a point cloud of the target over many more images in the "Observe" phase, which can then be matched in shape to a point cloud model, and the pose of the model accurately obtained by three-dimensional keypoint correspondences in the "Identify and Analyze" phase. In the end, the tracker spacecraft, with robot arms and end effectors, is intended to perform a projected "Capture and Servicing" phase. Features are detected as keypoints that are represented by small numerical sequences. We apply ORB (Oriented FAST and Rotated BRIEF) point descriptors for 2D feature matching with high rotation invariance [27]. We then use structure-from-motion methods to triangulate these points in space.

3D Reconstruction from Camera
A flowchart of the process we propose is shown in Figure 2, with details on each step provided in the following sections. A sequence of images is captured or cached, and features are extracted using two-dimensional point descriptors that are stored in memory and matched in pairs to obtain a list of images with features, as well as a list of features tracked across images. This list of feature correspondences is used to track the movement of keypoints across several poses, and if the triangulation is not good enough, a different pose containing those features is selected. Using a pose solution, the points and camera are projected into global coordinates. The resulting scene point cloud can then be compared with a model cloud to identify the target by choosing a set of keypoints and extracting histogram descriptors for each with respect to point normals. By matching descriptors between the scene and model, the model and its pose can be found within the scene.

Keypoint Detection and Matching
A method of keypoint detection must be used to obtain keypoints from a sequence of images.
ORB uses FAST corner detection to find keypoints, and computes an orientation for each from the intensity moments of the surrounding patch, m_pq = Σ_{x,y} x^p y^q I(x, y). The patch orientation can then be found by θ = atan2(m_01, m_10) and is Gaussian smoothed. ORB then applies the BRIEF feature descriptor f_n(p) = Σ_{1≤i≤n} 2^{i−1} τ(p; a_i, b_i), a bit string of binary intensity tests τ, each of which is defined from the intensity p(a) of a point at a relative to the intensity p(b) at a point at b by τ(p; a, b) = 1 if p(a) < p(b), and 0 otherwise [28]. The descriptor is also steered according to the orientations computed for the FAST keypoints by rotating the set of test point pairs (a_i, b_i), in 2×n matrix form, by the patch orientation θ to obtain the rotated set F_θ [27].
The steered BRIEF operator used in ORB then uses the rotated test pairs in F_θ, and a lookup table of steered BRIEF patterns is constructed to speed up computation of steered descriptors for subsequent points.
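The moment-based orientation and steered binary tests above can be sketched in NumPy. This is a simplified illustration, not the OpenCV ORB implementation; the patch, test-pair layout, and descriptor length are arbitrary choices for the example.

```python
import numpy as np

def patch_moment(patch, p, q):
    # m_pq = sum over x,y of x^p * y^q * I(x,y), with coordinates centred on the patch
    ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
    xs = xs - patch.shape[1] // 2
    ys = ys - patch.shape[0] // 2
    return float(np.sum((xs ** p) * (ys ** q) * patch))

def patch_orientation(patch):
    # theta = atan2(m_01, m_10)
    return np.arctan2(patch_moment(patch, 0, 1), patch_moment(patch, 1, 0))

def steered_brief(patch, pairs, theta):
    # f_n(p) = sum 2^(i-1) * tau(p; a_i, b_i), with test pairs rotated by theta
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    centre = np.array([patch.shape[1] // 2, patch.shape[0] // 2])
    limit = np.array([patch.shape[1] - 1, patch.shape[0] - 1])
    bits = 0
    for i, (a, b) in enumerate(pairs):
        ax, ay = np.clip(np.rint(R @ a + centre).astype(int), 0, limit)
        bx, by = np.clip(np.rint(R @ b + centre).astype(int), 0, limit)
        tau = 1 if patch[ay, ax] < patch[by, bx] else 0  # binary intensity test
        bits |= tau << i
    return bits
```

For a patch with a pure horizontal intensity gradient, m_01 vanishes by symmetry and the orientation is zero; rotating the test pairs by π flips the result of a left/right intensity test.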
Keypoints are then matched between two images in the sequence by attempting to find, for each keypoint a in the first image, a corresponding keypoint a′ in the second image. This can be done exhaustively by an XOR operation between each pair of descriptors and a population count to obtain the Hamming distance. However, the FLANN (Fast Library for Approximate Nearest Neighbors) search algorithm built into OpenCV is used in the current work, as it performs much faster while still providing good matches [29].
The more features two images have in common, the more potentially good matches M_f can be found, but it is essential that matches be correct correspondences or a valid transformation between the two images will be impossible. The matches M_f are first coarsely pruned of bad pairings by finding the maximum coordinate distance between matched points, d_max, and then removing all matches with a coordinate distance d_a of more than half this maximum, M_g = {M_f(a) | d_a < d_max/2}.
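A minimal sketch of the exhaustive Hamming matching and the d_max/2 pruning rule, using integer descriptors for brevity (the brute-force matcher shown here is what FLANN replaces in practice; names are illustrative):

```python
import numpy as np

def hamming(d1, d2):
    # XOR then population count gives the Hamming distance between bit strings
    return bin(d1 ^ d2).count("1")

def match_exhaustive(desc1, desc2):
    # for each descriptor in the first image, find its nearest neighbour in the second
    return [(i, min(range(len(desc2)), key=lambda j: hamming(d, desc2[j])))
            for i, d in enumerate(desc1)]

def prune_matches(matches, pts1, pts2):
    # coarse prune: M_g = { m in M_f : d_a < d_max / 2 }
    dists = [float(np.hypot(*np.subtract(pts1[i], pts2[j]))) for i, j in matches]
    d_max = max(dists)
    return [m for m, d in zip(matches, dists) if d < d_max / 2]
```

A match whose keypoints have moved far across the image relative to the rest of the set is discarded as a likely false pairing.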

Three-Dimensional Projection
To obtain depth in a 3D scene, an initial baseline for 3D projection is first required, using either stereoscopic vision or two sequential images from different angles. The fundamental matrix F is the transformation that maps each point in a first image to its epipolar line in a second image, and the set of "good" matches M_g is used, where each keypoint a_i in the first image is expected to map to a corresponding keypoint a′_i on the epipolar line in the second image by the relation a′_i^T F a_i = 0, i = 1, . . ., n [30].
For three-dimensional space, this equation is linear and homogeneous and the matrix F has nine unknown coefficients, so F can be solved for (up to scale) using eight keypoint matches with the method of Longuet-Higgins [31]. However, due to image noise and distortion, linear least-squares estimation or RANSAC [32] must be used to ensure that a "best" solution can be estimated. We use RANSAC for its speed to estimate F from all matches M_g and estimate the associated epipolar lines [33], while removing outliers more than 0.1 from their epipolar line from M_g to yield a final, reliable set of keypoint matches M_h. To perform a projection into undistorted space, a calibration matrix K is needed, either from calibration with a known pattern such as a checkerboard [34], or estimated for a size w × h image. A camera matrix is defined as C = K[R|t], with the rotation matrix R and the translation vector t defining the pose of the camera in space; for two images, we define two camera matrices C1 and C2. To localize a point in undistorted space, we formulate the so-called essential matrix E that relates two matching undistorted points â and â′ in the camera plane as â′_i^T E â_i = 0, i = 1, . . ., n [35]. In this way, E includes the "essential" assumption of calibrated cameras [36], and is related to the fundamental matrix by E = K^T F K.
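The epipolar constraint can be illustrated with a self-contained linear eight-point estimate of E from synthetic, noise-free normalized correspondences (RANSAC wraps such an estimator in practice; the scene geometry below is invented for the example):

```python
import numpy as np

def eight_point(x1, x2):
    # each correspondence contributes one row of the linear system A vec(E) = 0
    A = np.array([[u2*u1, u2*v1, u2, v2*u1, v2*v1, v2, u1, v1, 1.0]
                  for (u1, v1), (u2, v2) in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # enforce the rank-2 constraint of an essential matrix
    U, S, Vt2 = np.linalg.svd(E)
    return U @ np.diag([S[0], S[1], 0.0]) @ Vt2

# synthetic two-view geometry: camera 2 is translated and slightly rotated
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-1, 1, 12),
                     rng.uniform(-1, 1, 12),
                     rng.uniform(4, 6, 12)])
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t = np.array([0.5, 0.0, 0.1])
X2 = X @ R.T + t
x1 = X[:, :2] / X[:, 2:]   # normalized image coordinates, camera 1
x2 = X2[:, :2] / X2[:, 2:] # normalized image coordinates, camera 2
E = eight_point(x1, x2)
```

On clean data the epipolar residual â′^T E â is near machine precision for every correspondence; with real, noisy matches this is exactly where RANSAC's inlier threshold applies.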
After calculating E, we can find the location of the second camera C2 by assuming for simplicity that the first camera is unrotated and located at the origin (C1 = [I|0]). We decompose E = [t]_× R into its component R and t matrices by using the singular value decomposition (SVD) of E [37]. We start with the SVD E = U diag(1, 1, 0) V^T and the orthogonal matrix W = [[0, −1, 0], [1, 0, 0], [0, 0, 1]]. The matrix W does not directly depend on E, but provides a means of factorization for E. Detailed proofs can be found in [37] and are not reproduced here, but there are two possible factorizations of R, namely R = UW^T V^T and R = UWV^T, and two possible choices for t, namely t = U(0, 0, 1)^T and t = −U(0, 0, 1)^T. Thus, when determining the second camera matrix C2 = K[R|t], we have four choices in total.
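The four-fold ambiguity in recovering R and t from E can be sketched as follows (a hypothetical helper; a real pipeline then applies the in-front-of-camera test described below to pick the single physical solution):

```python
import numpy as np

def skew(v):
    # cross-product matrix [v]x
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]])

def decompose_essential(E):
    # SVD-based factorization E = [t]x R using the standard W matrix
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:   # keep proper rotations (det = +1)
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    t = U[:, 2]
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]
```

Every candidate reproduces ±E up to scale, which is why the triangulation-based depth test is needed to disambiguate.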
It is now possible to triangulate the original undistorted point positions in space with E and a pair of matched keypoints [a = (a_x, a_y), b = (b_x, b_y)] ∈ M_h using iterative linear least-squares triangulation [35]. A point in three dimensions x = (x_x, x_y, x_z, 1) written in the matrix equation form Ax = 0 gives four linear homogeneous equations in four unknowns for an appropriate choice of A_4x4. To solve this, we can write the system in inhomogeneous form Ax = B, with x = (x_x, x_y, x_z), and A_4x3 and B_4x1 as defined by Shil [38]. The solution x obtained by SVD is transformed to undistorted space by x′ = KC1x, assuming that the point is neither at 0 nor at infinity. This triangulation must be performed for each of the four combinations of R and t, and tested by perspective transformation with C1 and the condition x_z > 0 to ensure the resulting points p_i are in front of the camera.
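A minimal linear triangulation in the homogeneous Ax = 0 form (solved directly by SVD rather than the iterative refinement of [38]; the cameras and test point are invented for the example):

```python
import numpy as np

def triangulate(P1, P2, u1, u2):
    # each view contributes two rows: u*P[2] - P[0] and v*P[2] - P[1]
    A = np.array([u1[0] * P1[2] - P1[0],
                  u1[1] * P1[2] - P1[1],
                  u2[0] * P2[2] - P2[0],
                  u2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenize (assumes the point is not at infinity)
```

Running this for the four (R, t) candidates and keeping the one whose triangulated points have positive depth implements the in-front-of-camera test from the text.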

Image Selection
Using adjacent pairs of images in a closely-spaced time sequence allows feature points to be tracked more reliably between images, as there is less chance of lighting conditions or a change in angle causing a feature to change significantly. However, the disadvantage of using closely-spaced images for pose estimation is that a very small angular difference between two images will prevent good triangulation solutions, just as for very distant points. Therefore, we track, match, and store keypoints between closely-spaced images, but only triangulate between well-separated images that contain tracked keypoints in common.
If too few features are matched between image P_t at time step t and P_t−1, the next image to be obtained, P_t+1, is used with P_t−1; if that fails, then P_t+2 is used, and so on until a predefined "reset" limit. Valid matches from the new image P_t or later are added to the existing tracked keypoint list to associate feature numbers across the sequence of images. When obtaining the fundamental matrix F, only keypoints that have been associated between both images are used.
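The retry-until-reset image selection logic can be sketched as a small loop (function and parameter names are illustrative, not taken from the flight software):

```python
def select_pair(reference, candidates, count_matches, min_matches=30, reset_limit=5):
    """Try successive images against `reference` until one shares enough
    tracked features; give up (reset) after `reset_limit` attempts."""
    for attempt, frame in enumerate(candidates, start=1):
        if attempt > reset_limit:
            break
        if count_matches(reference, frame) >= min_matches:
            return frame
    return None  # caller re-initializes the baseline
```

The reset limit bounds how stale the reference image may become before the tracker starts a fresh baseline.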

Pose Estimation
Finding the ego-motion of the tracker's camera relative to feature points is the Perspective-n-Point (PnP) problem. For this, we apply the OpenCV implementation of the EPnP algorithm [39]. For the n-point cloud with points p_1 . . . p_n, four control points c_i define the world coordinate system and are chosen with one point at the centroid of the point cloud and the rest oriented to form a basis. Each reference point is described in world coordinates (denoted with w) as a linear combination of the c_i with weightings α_ij. This coordinate system is consistent across linear transforms, so the points have the same combination in the camera coordinate system (denoted with c). The known two-dimensional projections u_i of the reference points p_i are linked to these weightings by K, considering that the projection involves scalar projective parameters w_i, leading to w_i [u_i; 1] = K Σ_j α_ij c_j^c. Let the translation and rotation in world coordinates of the previous pose be t_w(t − 1) and R_w(t − 1), and those of the current pose be t_w(t) and R_w(t), for which we need to find the current camera matrix in world coordinates C_w(t).
Orientation is stored as a quaternion computed from the elements r_ij of R_w.

Object Pose Estimation
Object pose estimation focuses on the 3D reconstruction of the object; from the 3D point cloud, features are extracted to detect and classify the target. There are four main segmentation methods: local features [40] [41] [42], global features [43], graph matching [44] and machine learning [45] [46].
The use of point clouds in the presence of noise, varying mesh resolutions, poorly textured objects, clutter, and occlusion is very challenging [47] [48]. Segmentation in unstructured environments is difficult [49]. The image data for spacecraft and satellites in orbit are also often distorted and partially occluded due to shadowing. Using 3D point cloud-based recognition methods emphasizes overall shape and configuration over texture, and can tolerate a degree of distortion and occlusion. We test the proposed system of 3D keypoint descriptors using images of a real satellite model. In most cases, we must determine what the actual orientation of the target is with respect to a known geometric model, or identify specific parts of the target for interaction or analysis. For this task, we use the positional correspondences of three-dimensional keypoints selected from the constructed point cloud with respect to keypoints selected from a reference model point cloud, which can be obtained in advance or on-line from another sequence of images with known relative pose. These 3D keypoints (not to be confused with the 2D keypoints used for triangulation) provide a means to compare models on a per-pose basis with accumulated points in the scene point cloud once a sufficient number of images has been acquired during the "Observe" phase. This makes it possible to match parts of a structure without requiring the entire structure to have keypoints, for example if the target is in partial shadow. It also allows us to match parts of the target separately, given a sufficient number of points in the part being matched.

Target Identification
Evidence for a particular pose and instance of the model in the scene is initialized before voting by obtaining the vector between a unique reference point C_M and each model feature point F_M_i and transforming it into local coordinates using that feature's local x-y-z reference frame unit vectors L_M_i,x, L_M_i,y, and L_M_i,z. This precomputation can be done offline for the model in advance. For online pose estimation, Hough voting is performed by each scene feature F_S_j that has been found by FLANN matching to correspond with a model feature F_M_i, casting a vote for the position of the reference point C_M in the scene; each vote is transformed into global coordinates using the scene reference frame unit vectors and the scene feature position F_S_j. In the case that multiple matches are identified, a criterion for determining which one is the most appropriate is necessary. We choose the match with the largest number of corresponding keypoints as the most likely correct match.
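A much-simplified version of this voting scheme, with the local reference frames omitted (i.e., assuming the scene is a pure translation of the model), shows the mechanics; all names and the bin size are illustrative:

```python
import numpy as np

def hough_vote(model_feats, model_ref, scene_feats, matches, bin_size=0.1):
    # each matched scene feature votes for where the model reference point
    # should sit in the scene; votes accumulate in coarse spatial bins
    votes = {}
    for i_model, j_scene in matches:
        v = scene_feats[j_scene] + (model_ref - model_feats[i_model])
        key = tuple(np.round(v / bin_size).astype(int))
        votes[key] = votes.get(key, 0) + 1
    peak, count = max(votes.items(), key=lambda kv: kv[1])
    return np.array(peak) * bin_size, count
```

When the correspondences are consistent, all votes land in the same bin and the peak count equals the number of matches; mismatches scatter into other bins and are outvoted.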

Satellite Component Identification
The remote capture of spacecraft is a highly sensitive operation that is carefully planned beforehand to minimize the chance of error. For this reason, an automated grasp planner is not a good fit for orbital capture of a known spacecraft. Rather, the exact grasp point on the spacecraft should be specified beforehand using three-dimensional models, and the grasp planned based on the model and knowledge of the spacecraft's structure. The grasping operation can then be executed based on the position and motion of the target component. It is also necessary to verify the extents of the component and the whole spacecraft to ensure that no accidental contact is made during the grasping operation, which could cause both target and chaser to spin and separate before the grasp is completed. SHOT descriptors [41] are calculated by grouping together a set of local histograms over the volume about the keypoint, where this volume is divided by angle into 32 spherically-oriented spatial bins. Within a given radius r_d of the keypoint, point counts for the local histograms are binned as a cosine function cos(θ_i) = n_u · n_v_i of the angle θ_i between the point normal n_v_i within the corresponding part of the structure and the feature point normal n_u. This has the beneficial effects of creating general rotational invariance, since angles are relative to local normals; accumulating points into different bins as a result of small differences in relative direction; and creating a coarse partitioning that can be calculated quickly with small cardinality [50].
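The cosine binning at the heart of the descriptor can be sketched for a single volume (the real SHOT descriptor uses 32 spatial volumes with interpolation between bins; this one-volume histogram is only an illustration):

```python
import numpy as np

def cosine_histogram(neighbour_normals, keypoint_normal, n_bins=11):
    # bin cos(theta_i) = n_u . n_v_i into a fixed-range histogram
    cosines = np.clip(neighbour_normals @ keypoint_normal, -1.0, 1.0)
    hist, _ = np.histogram(cosines, bins=n_bins, range=(-1.0, 1.0))
    return hist / max(hist.sum(), 1)  # normalized local histogram
```

Because the angles are measured against the local keypoint normal rather than a global axis, rotating the whole cloud leaves the histogram unchanged, which is the rotational invariance noted above.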
Comparing the scene keypoint descriptors with the model keypoint descriptors to find good correspondence matches is done using a FLANN search on a k-dimensional tree (k-d tree) structure, similarly to the matching of image keypoints. Additionally, the BOrder Aware Repeatable Directions (BOARD) algorithm for local reference frame estimation is used to calculate local reference frames for each three-dimensional SHOT descriptor [51], making them independent of global coordinates for rotation and translation invariance. Once a set of nearest correspondences and local reference frames is found, clustering of correspondences to a given cluster size, set by a parameter r_c, is performed by pre-computed Hough voting to make recognition of shapes more robust to partial occlusion and clutter [52]. A threshold of at least n_thresh votes in Hough space is needed to estimate a valid pose.

3D Reconstruction and Identification
To test the identification of small satellite components, we use an engineering model of a small satellite with full-length fold-out solar panels, shown in Figure 3.This satellite serves as an example target for a simulated tracker satellite.Using the same process, point clouds were obtained of a solar panel and the satellite body itself.
Figure 7 shows the point clouds generated for these components.

3D Component Identification
Each of the component point clouds shown in Figure 7 was sequentially matched with the scene of the small satellite in Figure 6. To illustrate the matching process, the point cloud matched for each component is marked in yellow with keypoints indicated in green, and the scene is in full colour with keypoints marked in blue. The points of the matched component within the scene are indicated in red to show where the component's location has been identified. First, the solar panels were matched. Figure 8 shows the best match for the solar panel model, which corresponds with the left-side solar panel in the scene. The number of correspondences and the percentage of error observed in both rotation and translation are shown in Table 2. As there are fewer keypoints in smaller components such as the solar panels, they exhibit higher error in correspondence. Increasing the number of keypoints (and computational time) serves to mitigate this problem.

Dimensional Analysis
For each component identified on the spacecraft, we additionally estimate its size for purposes of planning and grasping by the chaser spacecraft. Table 3 shows the dimensions of the components estimated during the identification process, compared with actual measurements of size. The measurements of size in each direction are performed with respect to the coordinate axes of each component model, and simply indicate the extents of the scene points that have been matched with the model. For the spacecraft considered here, this is suitable, since all components are rectangular in form except for the entire satellite as a unit. The detected dimensions of each component are larger than their actual values because the scene points exhibit some degree of statistical variation due to numerical inaccuracies during the triangulation process, and this must be accounted for in planning and control of capture operations as well. This is particularly true for the Z-axis measurement of the thin solar panels. The closer and more accurately the chaser spacecraft can observe the target, the smaller these triangulation errors will be, since triangulation error increases with distance.
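The dimensional estimate described above is essentially an axis-aligned bounding box of the matched scene points in the component's model frame; a minimal sketch (point values invented):

```python
import numpy as np

def component_extents(matched_points):
    # extent along each model axis = max - min of the matched scene points
    return matched_points.max(axis=0) - matched_points.min(axis=0)
```

Because triangulation noise spreads matched points outward, these extents over-estimate the true dimensions, consistent with the behaviour reported in Table 3.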

Parameter Effects on Spacecraft Pose Identification Accuracy
To illustrate the accuracy of pose estimation, the descriptor radius r_d and cluster size r_c were varied across a set of pose estimates. The descriptor radius and cluster size for these estimates, with the resulting number of correspondences and rounded cumulative errors in translation and rotation, are shown in Table 4. As more scene points are added over time, accuracy can increase, but only if they are consistent with the existing scene. We can see from these results that increasing the size of the SHOT descriptor will increase the number of keypoints available and result in better accuracy and a higher likelihood of identifying a shape, but will also require longer processing times. Cluster sizes must be set appropriately for the point cloud size, as a cluster size too small or too large will prevent valid instances from being found and result in decreased accuracy.
The patent-free ORB algorithm, which combines FAST keypoint detection and BRIEF feature descriptors, provides good tolerance to rotation and scaling of features for this purpose. For useful reconstruction, it is important to identify as many features as possible, so target spacecraft with many colours, edges, and shapes generally provide the best results for feature-based systems such as this. It is important to note that this method of motion estimation provides its best solutions through post-processing of results. The more images that are included when creating the structure, the better triangulation will be. If processing power and storage are available to include a large number of recent images, such as by observing the target through multiple rotations, a better solution for motion will be obtained. To further decrease processing time if desired, the camera image can be lowered in resolution, or pixels can be under-sampled by choosing only every 2nd or every 4th pixel in a staggered pattern over the image for feature matching [53].
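Under-sampling every 2nd pixel in a staggered pattern might look like the following (one plausible reading of the scheme in [53]; the exact pattern used there may differ, and image dimensions are assumed divisible by the step):

```python
import numpy as np

def staggered_subsample(img, step=2):
    # keep every `step`-th row, offsetting the column start on alternate rows
    # so that successive kept rows sample interleaved pixel columns
    rows = [row[(i % step)::step] for i, row in enumerate(img[::step])]
    return np.array(rows)
```

An 8×8 image reduces to 4×4, cutting the pixels visited for feature matching by a factor of four while still covering both pixel phases across rows.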

Occlusion Effects on Spacecraft Pose Identification Accuracy
In the space environment, it is common that components are partially or fully occluded by shadows, which can be cast by either the chaser spacecraft or other components of the target spacecraft.
These shadows are total in an airless environment and prevent any features from being detected in a shadowed region. To evaluate the effects of partial shadowing on the small spacecraft model, features were removed from the scene point cloud used in previous tests so that, along the length of the spacecraft, the first 25%, 50%, and then 75% of features are in shadow, as shown in Figure 14, Figure 15, and Figure 16 respectively. All tests use a descriptor radius r_d = 0.5 m and a cluster size r_c = 0.25 m.
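Simulating partial shadowing by removing the first fraction of points along the spacecraft's length amounts to a simple mask (the axis choice and names are illustrative):

```python
import numpy as np

def shadow_points(points, fraction, axis=0):
    # drop points falling in the first `fraction` of the extent along `axis`
    lo = points[:, axis].min()
    hi = points[:, axis].max()
    cut = lo + fraction * (hi - lo)
    return points[points[:, axis] >= cut]
```

Applying this with fractions 0.25, 0.5, and 0.75 reproduces the shadowing conditions of Figures 14 through 16.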

Timing and Profiling
To profile the processing requirements of the described algorithms on a system that could potentially be embedded into a satellite, the algorithms were run on a 667 MHz ARM Cortex-A9 processor over the VGA images of the satellite engineering model used above, and raw timing statistics were gathered for the processing time of each algorithm. Tests 1 and 2 were performed with 6524 model points and 5584 scene points from 220 images, and tests 3 and 4 were performed with 6524 model points and 1816 scene points from 32 images. Tests 1 and 3 used a descriptor radius of 0.05 and cluster size of 0.1, and tests 2 and 4 used a descriptor radius of 0.1 and cluster size of 0.5. Table 6 and Table 7 show the timing information obtained, in seconds, for each of the described algorithms in these cases. While accurate matching of large models and scenes can take on the order of minutes, this does not prevent a chaser spacecraft from building a motion model over long periods of time from stored images before acting to rendezvous, and both software and hardware acceleration methods may be used to further improve this performance.

Conclusions
This study proposes a 3D pose estimation, recognition and identification system for a small spacecraft servicing mission that uses a monocular camera sensor. This study uses Structure from Motion (SfM) to build a 3D model from 2D images, and a SHOT descriptor to identify surface shape components. The EPnP process estimates object poses and increases the system's ability to identify positions and angles. The experimental results show that the proposed system can effectively identify components and poses of a spacecraft model in the laboratory. Potential application of this system to an orbital demonstration mission with industry partners is under investigation.
In this work, we have described a feature-based visual identification system that allows a tracker spacecraft to track its movement relative to a target and ultimately acquire pose estimates using point cloud techniques. Using projective geometry, we perform three-dimensional reconstruction of features on the target from a sequence of images taken with a single camera. It is intended that even small spacecraft with a single camera could take advantage of this system. Work is underway to scale this system to a level suitable for small satellite use, which could provide a technology demonstration with a minimum of cost and risk. As the performance of feature tracking depends very heavily on the design of the feature descriptor and the method of matching, further comparison of descriptor types for both two-dimensional and three-dimensional matching is warranted, and FPGA acceleration is being developed for this system. Future work also includes the validation of these methods with a variety of different spacecraft and vision hardware, and under a broader set of varying conditions, to evaluate the robustness of feature-based systems.

2. Multi-view feature triangulation and PnP solution for ego-motion (structure-from-motion)
3. Characterization of point cloud shapes using the 3D SHOT descriptor
4. Point cloud correspondence using FLANN and Hough voting for object and partial object recognition
We also perform the following laboratory tests using an engineering model of a small satellite sequentially imaged at multiple angles to simulate observation of a tumbling target by a tracker satellite:
5. Identification and dimensional analysis of small satellite components by comparing a component model to a scene
6. Comparison of the effects of variation in SHOT parameters on pose identification accuracy
7. Investigation of the effects of partial spacecraft occlusion on pose identification accuracy
8. Evaluation of timing required for processing on a representative embedded processor
The structure of this paper is as follows: Section 1 provides the background and value of our work. The overall workflow and principles are described in Section 2. The results and discussion are shown in Section 3. The conclusions are given in Section 4.

The expansion of this equation has 12 unknowns in the control points x = [c_1^cT, c_2^cT, c_3^cT, c_4^cT]^T and n projective parameters. Two linear equations can be obtained for each reference point to form a system Mx = 0, where the null space (kernel) of the matrix M_2nx12 gives the solution x = [c^c] to the system of equations, which can be expressed as x = Σ_{i=1}^{m} β_i v_i. The set v_i is composed of the null eigenvectors of the product M^T M corresponding to the m null singular values of M. The method of solving for the coefficients β_1 . . . β_m depends on the size of m, and four different methods are used in the literature [39] for practical solution.
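The null-space step can be sketched with NumPy: the candidate basis v_i comes from the eigenvectors of M^T M with the smallest eigenvalues (the β_i weighting methods of [39] are not reproduced here):

```python
import numpy as np

def null_basis(M, m=1):
    # eigenvectors of M^T M for the m smallest eigenvalues approximate ker(M)
    _, V = np.linalg.eigh(M.T @ M)  # eigh returns ascending eigenvalues
    return V[:, :m]
```

With noisy measurements the small eigenvalues are not exactly zero, which is why the size of the effective null space m must be decided case by case.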
The votes cast are thresholded to find the most likely instance of the model in the scene, although multiple peaks in the Hough space are fairly common and can indicate multiple possibilities for model instances. Due to the statistical nature of Hough voting, it is possible to recognize partially-occluded or noisy model instances, though accuracy may be lower. Satellite components are identified by first preparing exemplar point clouds, such as a model of a solar panel, that can be stored and used for reference by the tracker spacecraft. These model point clouds are then located in the actual reconstructed 3D scene point cloud created by the tracker spacecraft. We focus on the solar panels as an example of external satellite components that are easy to grasp and manipulate in a rendezvous operation, and on the body of the spacecraft, which indicates overall positioning. Solar panels may also not remain at a precise angle with respect to the spacecraft body, and therefore must be identified in isolation from the spacecraft body to ensure accuracy. The identification process begins with a set of three-dimensional keypoints being chosen from both the scene and the model by randomly choosing individual points from the cloud separated by a given sampling radius r_k. Normals are calculated for these keypoints relative to nearby points so that each keypoint has a repeatable orientation. The keypoints are then associated with three-dimensional SHOT point descriptors.

Figure 3. Small satellite engineering model

Figure 10 shows the body of the satellite identified. The parameters used for the SHOT descriptors in these tests were a model and scene sampling radius of r_k = 0.025 m, a reference frame and descriptor radius of r_d = 0.5 m, a cluster size of r_c = 0.25 m, and a clustering threshold of n_thresh = 5.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 22 January 2018 doi:10.20944/preprints201801.0195.v1
To speed the development process and minimize coding errors and complexity, we make use of the open-source OpenCV (Open Computer Vision) and PCL (Point Cloud Library) libraries for most of the computer vision programming. We consider the situation of a rendezvous zone where spacecraft are separated by several metres or tens of metres, with the intention of matching velocity and attitude for rendezvous. Precise manoeuvring and capture require the use of short-range sensing on the satellite itself. Outside of the range where optical sensors are useful, other sensors can be used for coarse positioning and estimation, such as GNSS and telemetry from ground tracking stations. The flexibility of visual-only pose estimation also means that it has many potential applications in other fields such as planetary rover navigation, but the movement of hardware complexity to software complexity in vision systems requires a corresponding increase in computing resources. Hardened computing hardware for space can take between several seconds and several minutes for simple image recognition tasks; the Mars Exploration Rovers required 42 seconds to process a single image pair for navigation with no recognition task [26]. In this work, the ORB descriptor is used with FLANN matching as an open alternative to SIFT and SURF for feature detection. The Point Cloud Library provides the framework for processing, storage, and visualization of the point cloud, and a review of the multiple-view geometry used to create a point cloud from multiple poses is provided.

The PnP solution across a sequence of images allows us to track the pose of the tracker spacecraft relative to features on the target spacecraft. However, in most cases it is also necessary to identify the actual orientation of the target.

Table 2. Correspondences and Error resulting from varying Descriptor Radius and Cluster Size

Table 3. Dimensional Analysis of Spacecraft Components

Table 4. Correspondences and Error resulting from varying Descriptor Radius and Cluster Size

Table 5. Correspondences and Error resulting from varying Descriptor Radius and Cluster Size

Table 6. Timing for Features, Triangulation and PnP in seconds

Table 7. Timing for Correspondence and Identification in seconds