
Implement OpenCV's Hand-Eye Calibration for Comparison #912

Open
Kazadhum opened this issue Mar 29, 2024 · 43 comments

@Kazadhum
Collaborator

Hello! I'm trying to "ATOM-ize" OpenCV's Hand-Eye calibration so we can compare our results with theirs. This issue is related to #891.

I've seen people implement OpenCV's calibrateHandEye method [1] [2], but I don't think it is the same as our method and, as such, the results are not comparable.

In this method, the problem is formulated as $AX = XB$, and the only transformation estimated is the transformation between the camera and the gripper.

I think what we want to use is the calibrateRobotWorldHandEye() method, where the $AX=ZB$ problem is addressed.

I'm currently trying to get from ATOM's dataset the necessary transformations to perform the calibrations (right now I'm going to try getting the transform from the camera to the calibration pattern).
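For context, here is a minimal sketch of the two OpenCV calls being contrasted, using placeholder random poses just to illustrate the input/output shapes (the argument names follow the OpenCV documentation; nothing here is ATOM-specific):

import cv2
import numpy as np

# Placeholder per-collection poses (random, only to show the call shapes;
# both functions require at least 3 pose pairs).
rng = np.random.default_rng(0)

def random_pose():
    R, _ = cv2.Rodrigues(rng.standard_normal((3, 1)))
    return R, rng.standard_normal((3, 1))

R_gripper2base, t_gripper2base = map(list, zip(*[random_pose() for _ in range(3)]))
R_target2cam, t_target2cam = map(list, zip(*[random_pose() for _ in range(3)]))
R_base2gripper, t_base2gripper = map(list, zip(*[random_pose() for _ in range(3)]))

# AX = XB: estimates a single static transform (camera -> gripper).
R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
    R_gripper2base, t_gripper2base, R_target2cam, t_target2cam)

# AX = ZB: estimates two static transforms (base -> world and gripper -> camera).
# OpenCV treats the calibration pattern as the "world" frame here.
R_base2world, t_base2world, R_gripper2cam, t_gripper2cam = \
    cv2.calibrateRobotWorldHandEye(
        R_target2cam, t_target2cam, R_base2gripper, t_base2gripper)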

@Kazadhum Kazadhum added the enhancement New feature or request label Mar 29, 2024
@Kazadhum Kazadhum self-assigned this Mar 29, 2024
@Kazadhum
Collaborator Author

Hi @miguelriemoliveira and @manuelgitgomes! I want to get the transformation from the camera to the pattern based on the detections in a given collection. Before I dive into how to do this, I wanted to ask: is this already done somewhere in ATOM?

@Kazadhum
Collaborator Author

I think I found an example of it in the patterns.py module.

_, rvecs, tvecs = cv2.solvePnP(
    objp[ids], np.array(corners, dtype=np.float32), K, D)
cv2.drawFrameAxes(image, K, D, rvecs, tvecs, length)

I think this is done by the solvePnP() function.
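For reference, the rvec/tvec pair returned by solvePnP() can be converted into a 4x4 homogeneous transform along these lines (just a sketch; ATOM may already have a helper that does exactly this):

import cv2
import numpy as np

def rvec_tvec_to_transform(rvec, tvec):
    """Build a 4x4 camera_T_pattern matrix from solvePnP output."""
    T = np.eye(4)
    T[0:3, 0:3], _ = cv2.Rodrigues(rvec)    # rotation vector -> 3x3 rotation matrix
    T[0:3, 3] = np.asarray(tvec).flatten()  # translation
    return T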

Kazadhum added a commit that referenced this issue Mar 29, 2024
@Kazadhum
Collaborator Author

Hello @miguelriemoliveira and @manuelgitgomes! I've made some progress, but not a lot.

I'm still only considering one collection instead of the whole dataset, to make testing easier. I can now calculate the transformation from the pattern to the camera here:

#####################################
# Get transform from target to cam
#####################################
# NOTE: The documentation calls this the transform from world to cam because in
# their example (which is eye-in-hand), the pattern's frame is the world frame.
# This is not the case in ours, so when the documentation mentions the world to
# cam tf, it corresponds to our pattern/target to camera tf. (I think.)
# Use solvePnP() to get this transformation
# Need to check if these calculations (from here to l. 108) are correct
print('Calculating transform from camera to pattern for collection 006....')
_, rvec, tvec = cv2.solvePnP(objp, undistorted_imgpoints_camera[0], K, D)
print(f'rvec = {rvec}')
print(f'tvec = {tvec}')
# Convert it into a homogeneous transform
cam_optical_frame_T_pattern = traslationRodriguesToTransform(tvec, rvec)
print(f'Transform from the camera\'s optical frame to the pattern frame = {cam_optical_frame_T_pattern}')
# Get tf from camera's optical frame to camera frame
tmp_transforms = dataset['collections']['006']['transforms']
cam_T_cam_optical_frame = getTransform('rgb_world_link', 'rgb_world_optical_frame', tmp_transforms)
print(f'Transform from cam to cam optical = {cam_T_cam_optical_frame}')
cam_T_pattern = np.dot(cam_T_cam_optical_frame, cam_optical_frame_T_pattern)
print(f'Transform from camera to pattern = {cam_T_pattern}')
# I think I need to invert the transform matrix
pattern_T_cam = np.linalg.inv(cam_T_pattern)
# Split into R and t matrices for the calibration function
pattern_R_cam = pattern_T_cam[0:3, 0:3]

Admittedly, this code is a work in progress and the variable names aren't the best they could be. Some things are also hardcoded for now for testing reasons. I'm unsure how to check whether the values this calculates are correct. How should I test this?
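One possible sanity check (a sketch, reusing objp, undistorted_imgpoints_camera, K, D, rvec and tvec from the snippet above): reproject the pattern corners with the estimated pose and look at the mean pixel error, which should be small if the pose is right.

# Reproject the 3D pattern corners with the estimated pose and compare them
# with the detected 2D corners that were given to solvePnP().
projected, _ = cv2.projectPoints(objp, rvec, tvec, K, D)
error = np.linalg.norm(projected.reshape(-1, 2) -
                       undistorted_imgpoints_camera[0].reshape(-1, 2), axis=1).mean()
print(f'Mean reprojection error = {error:.3f} pixels')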

Additionally, assuming the values in the matrix are correct, calling cv2.calibrateRobotWorldHandEye() returns an error:

Traceback (most recent call last):
  File "./cv_handeye_calib.py", line 209, in <module>
    main()
  File "./cv_handeye_calib.py", line 205, in main
    cvHandEyeCalibrate(objp=objp, dataset=dataset, camera=camera, pattern=pattern, number_of_corners=number_of_corners)
  File "./cv_handeye_calib.py", line 126, in cvHandEyeCalibrate
    cv2.calibrateRobotWorldHandEye(pattern_R_cam, pattern_t_cam, base_R_gripper, base_t_gripper)
cv2.error: OpenCV(4.6.0) /io/opencv/modules/core/src/matmul.dispatch.cpp:363: error: (-215:Assertion failed) a_size.width == len in function 'gemm'

To replicate, run the command:

clear && ./cv_handeye_calib.py -json ~/atom_datasets/riwmpbot_real/merged/dataset_train.json -c rgb_world -p hand_pattern

Maybe we should meet sometime next week, since I have some doubts about how to go about coding this. When are you available?

@miguelriemoliveira
Member

miguelriemoliveira commented Mar 30, 2024

Hi @Kazadhum ,

I am looking into this. Using this opencv page as reference:

https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html#ga41b1a8dd70eae371eba707d101729c36

So the transformations are:

cTw (world to camera) -> this is given by solvePnP with the pattern detector. This is the B in equation (1) of our IROS paper.

NOTE: in the IROS paper we use the camera to world transform; not sure if this is just semantics or if the inverse transform is used.
For example, cTw we read from left to right, as in world to camera, but in OpenCV they read it the other way around, i.e., camera to world.

gTb (base to gripper) -> this is given by the robot's direct kinematics. This is the A in our IROS paper.

So I guess what we need to do in a script is to translate our ATOM datasets into these lists of matrices (vector<Mat>) and call the OpenCV function.
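Something along these lines, perhaps (a rough sketch: pts_2d stands for the detected 2D corners of each collection, the frame names passed to getTransform are only examples, and traslationRodriguesToTransform / getTransform are the ATOM helpers already used above):

# Rough sketch: build the per-collection lists of matrices expected by
# cv2.calibrateRobotWorldHandEye() from an ATOM dataset.
R_world2cam, t_world2cam = [], []        # cTw from solvePnP, one per collection
R_base2gripper, t_base2gripper = [], []  # gTb from direct kinematics, one per collection

for collection_key, collection in dataset['collections'].items():
    _, rvec, tvec = cv2.solvePnP(objp, pts_2d[collection_key], K, D)
    c_T_w = traslationRodriguesToTransform(tvec, rvec)
    R_world2cam.append(c_T_w[0:3, 0:3])
    t_world2cam.append(c_T_w[0:3, 3].reshape(3, 1))

    g_T_b = getTransform('flange', 'base_link', collection['transforms'])
    R_base2gripper.append(g_T_b[0:3, 0:3])
    t_base2gripper.append(g_T_b[0:3, 3].reshape(3, 1))

R_base2world, t_base2world, R_gripper2cam, t_gripper2cam = \
    cv2.calibrateRobotWorldHandEye(R_world2cam, t_world2cam,
                                   R_base2gripper, t_base2gripper)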

@miguelriemoliveira
Member

miguelriemoliveira commented Mar 30, 2024

I am using this (it may be helpful to replicate):

rosrun atom_evaluation cv_handeye_calib.py -json $ATOM_DATASETS/riwbot/train/dataset.json -c rgb_world -p pattern_1

This is using the riwbot atom example.

@miguelriemoliveira
Member

I confirm the error above...

@miguelriemoliveira
Member

So the error is because you are giving the function numpy arrays, when it needs lists of numpy arrays (at least 3, according to the documentation).

I will try to refactor the function to create these lists ...

@miguelriemoliveira
Member

Fixed it. Now it does not crash. Not sure if it's returning correct results, though. @Kazadhum let me know if you need more help ...

miguelriemoliveira added a commit that referenced this issue Mar 30, 2024
@Kazadhum
Collaborator Author

Kazadhum commented Apr 1, 2024

I'm picking this back up...

@Kazadhum
Collaborator Author

Kazadhum commented Apr 1, 2024

Now that this is running and there are no errors, one thing I don't understand: we use 3 collections as input and get two transformations as output:

  • Base to World;
  • Gripper to Camera;

The thing is, what collection does this output refer to? Base to World should be static, I think, but the Gripper to Camera transform should change with different collections, no?

Maybe I'm missing something obvious...

@miguelriemoliveira
Member

Hi @Kazadhum ,

use riwbot as the example.

Here those transformations are both static, check https://github.com/lardemua/atom/tree/noetic-devel/atom_examples/riwbot#configuring-the-calibration

The camera does not move w.r.t. the gripper. It's also static.
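Put differently, for every collection $i$ the same two unknowns appear:

$$ {}^{c}T_{w,i} \; X = Z \; {}^{g}T_{b,i} $$

where ${}^{c}T_{w,i}$ (from solvePnP) and ${}^{g}T_{b,i}$ (from the direct kinematics) change from collection to collection, while $X$ (the base/world transform) and $Z$ (the gripper/camera transform) are the same static transforms in every equation. That is why the calibration returns a single pair of transforms and not one pair per collection.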

@miguelriemoliveira
Member

If you want we can talk a bit later ...

@Kazadhum
Collaborator Author

Kazadhum commented Apr 1, 2024

Hello @miguelriemoliveira! I understand now, but maybe we should still talk... What time is good for you?

@miguelriemoliveira
Member

18h?

@Kazadhum
Collaborator Author

Kazadhum commented Apr 1, 2024

Sounds good!

@Kazadhum
Collaborator Author

Kazadhum commented Apr 4, 2024

Now adding a function to save the calibrated dataset, based on what was done for the ICP Calibration:

def saveICPCalibration(dataset, sensor_source, sensor_target, transform, json_file, descriptor):
    """
    Save a JSON file with the data of the ICP calibration
    """
    res = atomicTfFromCalibration(dataset, sensor_target, sensor_source, transform)
    child_link = dataset['calibration_config']['sensors'][sensor_source]['child_link']
    parent_link = dataset['calibration_config']['sensors'][sensor_source]['parent_link']
    frame = parent_link + '-' + child_link

    quat = tf.transformations.quaternion_from_matrix(res)
    for collection_key, collection in dataset['collections'].items():
        dataset['collections'][collection_key]['transforms'][frame]['quat'] = quat
        dataset['collections'][collection_key]['transforms'][frame]['trans'] = res[0:3, 3]

    # Save results to a json file
    filename_results_json = os.path.dirname(
        json_file) + f'/ICPCalibration_{sensor_source}_to_{sensor_target}_{descriptor}.json'
    saveAtomDataset(filename_results_json, dataset)
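A first sketch of the adapted version could look something like this (the function name is hypothetical, it assumes the estimated transform has already been converted to the sensor's calibration parent-to-child transform, and saveAtomDataset is the same helper used by the ICP script):

import os
import tf

def saveOpenCVHandEyeCalibration(dataset, sensor, transform, json_file, descriptor):
    """
    Save a JSON dataset with the transform estimated by the OpenCV hand-eye calibration.
    'transform' is assumed to already be the parent_link -> child_link transform of the sensor.
    """
    child_link = dataset['calibration_config']['sensors'][sensor]['child_link']
    parent_link = dataset['calibration_config']['sensors'][sensor]['parent_link']
    frame = parent_link + '-' + child_link

    quat = tf.transformations.quaternion_from_matrix(transform)
    for collection_key in dataset['collections']:
        dataset['collections'][collection_key]['transforms'][frame]['quat'] = quat
        dataset['collections'][collection_key]['transforms'][frame]['trans'] = transform[0:3, 3]

    # Save results to a json file
    filename_results_json = os.path.dirname(
        json_file) + f'/cv_handeye_calib_{sensor}_{descriptor}.json'
    saveAtomDataset(filename_results_json, dataset)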

Kazadhum added a commit that referenced this issue Apr 9, 2024
Kazadhum added a commit that referenced this issue Apr 9, 2024
@Kazadhum
Collaborator Author

Kazadhum commented Apr 9, 2024

I created a new dataset for rihbot, which I named train_diogo. I had to do this for testing because the original dataset for rihbot only had one collection with a complete detection (a minimum of 3 are needed). This new one has 12 collections and all of them have every corner detected.

@miguelriemoliveira I can't sync the rihbot dataset folder with your NAS right now since I'm at the lab, but once I'm home I'll sync it.

@miguelriemoliveira
Member

@miguelriemoliveira I can't sync the rihbot dataset folder with your NAS right now since I'm at the lab, but once I'm home I'll sync it.

Why not?

@Kazadhum
Collaborator Author

Kazadhum commented Apr 9, 2024

I think it's something to do with eduroam but I'm not sure

@miguelriemoliveira
Member

Hum ... @manuelgitgomes do you have the same problem?

@miguelriemoliveira
Member

I suppose this is it:

rosrun atom_evaluation cv_handeye_calib.py -json $ATOM_DATASETS/rihbot/train_diogo/dataset.json -p pattern_1 -c rgb_hand

you confirm?

@Kazadhum
Collaborator Author

Kazadhum commented May 2, 2024

Hello @miguelriemoliveira! I just got a new laptop and I'm still setting it up so I can't confirm it, but I believe I didn't use rosrun and just ran it as a script. It should work the same though. I think the arguments are all correct. Just to check, I think I did:

./cv_handeye_calib.py -c rgb_hand -p pattern_1 -json $ATOM_DATASETS/rihbot/train_diogo/dataset.json

@miguelriemoliveira
Member

miguelriemoliveira commented May 2, 2024

Better my way, because you don't have to be in the script's directory.

@miguelriemoliveira
Member

miguelriemoliveira commented May 4, 2024

Hi @Kazadhum ,

Finally, I was able to make this work. This was a challenge.
Now it's working. Some details:

  1. It runs for all 5 OpenCV-supported methods, i.e. 'tsai', 'park', 'horaud', 'andreff', 'daniilidis',
    which means we have five alternative methods :-).
    Check the --method_name flag

  2. It converts the estimated hand_T_camera transformation to the one we are actually searching for as per atom configuration, i.e. calibration_parent_T_calibration_child. Tested and seems to be working.

  3. It compares against ground truth, if the -ctgt flag is used

[image: comparison against ground truth]

  4. It saves a json dataset with the calibration, e.g. hand_eye_daniilidis_rgb_hand.json

miguelriemoliveira added a commit that referenced this issue May 4, 2024
@miguelriemoliveira
Member

Here's how to use it:

clear && rosrun atom_evaluation cv_eye_in_hand.py -json $ATOM_DATASETS/rihbot/train_test_opencv/dataset.json -p pattern_1 -c rgb_hand -hl flange -bl base_link -mn daniilidis -ctgt

miguelriemoliveira added a commit that referenced this issue May 5, 2024
@miguelriemoliveira
Member

@Kazadhum , I am done with this. Take a look and see if you agree.

Things for you:

  1. Close this issue if you confirm all is working fine.
  2. Delete your old code in the cv_hand_eye_calib.py
  3. Create an eye-to-base script using data from riwbot to test (probably in another issue).

@Kazadhum
Collaborator Author

Kazadhum commented May 5, 2024

Ok, thank you @miguelriemoliveira! I will take a look at the code on the train to Aveiro and give you feedback.

On Monday I'll create the eye-to-base script and keep working on #939.

@Kazadhum
Collaborator Author

Kazadhum commented May 5, 2024

Hi @miguelriemoliveira! I took a look at the script and everything seems alright to me (much cleaner than what I had written). I also tested it and it worked just fine. So I think I can close this issue.

I already found some things that I'm going to have to change in the eye-to-hand script, so we're definitely going to have two scripts, like we discussed. Thank you for your help! I'm now going to start working on the eye-to-hand case, based on this one (in #943). I'm also going to delete my old script.

@Kazadhum
Collaborator Author

Kazadhum commented May 8, 2024

Reopening to fix the issue found in #943...

In the eye-in-hand case, the estimated h_T_c is almost the same as the ground truth, but the result of the comparison seems exaggerated.

Running:
rosrun atom_evaluation cv_eye_in_hand.py -c rgb_hand -p pattern_1 -hl flange -bl base_link -json $ATOM_DATASETS/rihbot/train_test_opencv/dataset.json -ctgt

returns:

Ground Truth h_T_c=
[[ 0.     0.     1.    -0.02 ]
 [-1.    -0.     0.     0.   ]
 [ 0.    -1.     0.     0.065]
 [ 0.     0.     0.     1.   ]]
estimated h_T_c=
[[ 0.    -0.     1.    -0.021]
 [-1.    -0.001  0.    -0.001]
 [ 0.001 -1.    -0.     0.065]
 [ 0.     0.     0.     1.   ]]
Etrans = 0.182 (m)
Erot = 0.021 (deg)
+----------------------+-------------+---------+----------+-------------+------------+
|      Transform       | Description | Et0 [m] |  Et [m]  | Rrot0 [rad] | Erot [rad] |
+----------------------+-------------+---------+----------+-------------+------------+
| flange-rgb_hand_link |   rgb_hand  |   0.0   | 0.000762 |     0.0     |  0.000365  |
+----------------------+-------------+---------+----------+-------------+------------+

The Etrans looks like it's a lot higher than it should be, judging from the matrices...

@Kazadhum
Collaborator Author

Kazadhum commented May 8, 2024

In:

print('Etrans = ' + str(round(translation_error*1000, 3)) + ' (m)')

isn't this value supposed to be in mm, since it comes from the comparison between tfs and then is multiplied by 1000?

If so, then these errors make sense
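If that's the case, a minimal fix is just to correct the printed unit (keeping the same value):

# translation_error is in meters; after multiplying by 1000 the printed value is in millimeters
print('Etrans = ' + str(round(translation_error*1000, 3)) + ' (mm)')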

@Kazadhum
Collaborator Author

Kazadhum commented May 8, 2024

Hi @miguelriemoliveira! If you agree with this change, this can be closed again, because the eye-in-hand case is working okay, I think.

@Kazadhum Kazadhum reopened this May 8, 2024
@miguelriemoliveira
Member

So the problem from yesterday is not visible in the eye-in-hand case ... that's good news, I think.

@Kazadhum
Collaborator Author

Hello @miguelriemoliveira and @manuelgitgomes!

In my last e-mail, I said that implementing noise in the OpenCV calibration scripts was one of my next tasks. Today I tested it and found that the results with and without noise were exactly the same. This also happened for larger noise values.

Having noted this, I looked at the code and concluded that adding noise the way it is done in ATOM doesn't make sense for the OpenCV methods.

Firstly, ATOM's optimization is an iterative method, whereas OpenCV doesn't perform an optimization. The noise we add to a dataset to test ATOM's calibration is added to the initial guess. Given this difference between the methods, I don't think there is a good way to introduce noise into the OpenCV calibration that allows a fair comparison with ATOM.

Secondly, there's how OpenCV's method computes the $A$ and $B$ matrices. One of them is calculated through the solvePnP() function, so the noise doesn't change its value:

# ---------------------------------------
# Get c_T_p using pattern detection
# ---------------------------------------
# We will use the solvePnP function from OpenCV.
# This requires a np array of the pattern corners' 3d coordinates
# and another np array of the corresponding 2d pixels.
# 3d coordinates were extracted previously, so lets go to 2d..
pts_2d = np.ones((number_of_corners, 2), np.float32)
for idx, point in enumerate(collection['labels'][args['pattern']][args['camera']]['idxs']):
    pts_2d[idx, 0] = point['x']
    pts_2d[idx, 1] = point['y']
retval, rvec, tvec = cv2.solvePnP(pts_3d, pts_2d, K, D)
if not retval:
    raise atomError('solvePnP failed.')

The other is calculated by forward kinematics. However, for the eye-to-hand case, the transformation chain from the hand link to the base link isn't affected by the noise:

# ---------------------------------------
# Get h_T_b using direct kinematics
# ---------------------------------------
# We will use the atom builtin functions to get the transform directly
h_T_b = getTransform(from_frame=args['hand_link'],
                     to_frame=args['base_link'],
                     transforms=collection['transforms'])
h_T_b_lst.append(h_T_b)

With this in mind, I don't really think it makes sense to implement this. Do you agree?
