
Locatable Camera in Unity

Discussion in 'VR' started by JohannesPTC, Apr 20, 2016.

  1. JohannesPTC

    JohannesPTC

    Joined:
    Oct 9, 2012
    Posts:
    13
    I'm trying to figure out how to map positions from the RGB webcam on the HoloLens into 3D coordinates in the Unity scene (or more precisely, rays). For this, I need to know the camera pose and projection matrix.

    According to this developer page from Microsoft, in Unity "a CameraToWorldMatrix is automatically provided by LocatableWebCamFrame":
    https://developer.microsoft.com/en-...ra_to_Application-specified_Coordinate_System

    However, a LocatableWebCamFrame class does not exist in the Unity HoloLens technical preview :)

    I can only find a PhotoCaptureFrame class, and that is only returned asynchronously when taking still images. There seems to be nothing comparable for videos. Is there another way to get to this transformation?
     
    sdavari likes this.
  2. Unity_Wesley

    Unity_Wesley

    Unity Technologies

    Joined:
    Sep 17, 2015
    Posts:
    558
    Hello,

    It appears the documentation is a little out of date: LocatableWebCamFrame is now the PhotoCaptureFrame class. The only way you can get the projection matrix is by taking a picture.

    For example, if you were capturing your photo to memory, you would call TryGetProjectionMatrix on the photoCaptureFrame:

    photoCaptureFrame.TryGetProjectionMatrix(Camera.main.nearClipPlane, Camera.main.farClipPlane, out holoLensProjectionMatrix);

    The out parameter holoLensProjectionMatrix should then contain the HoloLens camera's projection matrix.
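
    For context, here's a rough sketch of how that call could sit inside the TakePhotoAsync callback (the class name and surrounding structure here are just illustrative, not part of the API):

    Code (CSharp):
    using UnityEngine;
    using UnityEngine.VR.WSA.WebCam;

    public class ProjectionMatrixExample : MonoBehaviour
    {
        void OnCapturedPhotoToMemory(PhotoCapture.PhotoCaptureResult result, PhotoCaptureFrame photoCaptureFrame)
        {
            // The matrix is only available when the frame carries location data,
            // so check the return value before using it.
            Matrix4x4 holoLensProjectionMatrix;
            if (photoCaptureFrame.TryGetProjectionMatrix(Camera.main.nearClipPlane, Camera.main.farClipPlane, out holoLensProjectionMatrix))
            {
                Debug.Log("HoloLens projection matrix: " + holoLensProjectionMatrix);
            }
        }
    }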

    Let me know if this helps. Also, if you provide some feedback on the implementation that would be awesome.

    Thank you,
    Wesley
     
  3. JohannesPTC

    JohannesPTC

    Joined:
    Oct 9, 2012
    Posts:
    13
    Thanks Wesley,

    Just tried the PhotoCaptureFrame class and it gives me the results I expect - that's very useful!
    However, I've got a few questions:

    - I tried to compute the offset of the RGB camera from the virtual camera using the cameraToWorldMatrix of the PhotoCaptureFrame. However, the PhotoCaptureFrame is returned asynchronously, so it is not clear whether it represents the offset from the virtual camera at the time the callback is received, at the time the TakePhotoAsync call was issued, or somewhere in between.

    - Is it safe to assume that these matrices are consistent between photo and video capture (if the same resolution etc. is used)?
     
  4. Unity_Wesley

    Unity_Wesley

    Unity Technologies

    Joined:
    Sep 17, 2015
    Posts:
    558
    I believe the projection matrix is a little offset from the Unity main camera; the HoloLens web camera is a little higher and farther forward in the world (the physical location of the camera on the HoloLens device is used). You get the matrix after the photo is finished so that you have the most accurate matrix for the moment of capture.

    Only photo capture has a projection matrix; video capture doesn't have this currently.
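
    If you want to sanity-check that offset yourself, here's a rough sketch (assuming you're inside the TakePhotoAsync callback and the frame has location data):

    Code (CSharp):
    Matrix4x4 cameraToWorldMatrix;
    if (photoCaptureFrame.TryGetCameraToWorldMatrix(out cameraToWorldMatrix))
    {
        // World-space position of the physical web camera at capture time.
        Vector3 webCamPosition = cameraToWorldMatrix.MultiplyPoint(Vector3.zero);

        // Compare against the virtual camera's current position. The head may
        // have moved between capture and this callback, so this only
        // approximates the fixed physical offset.
        Vector3 offset = webCamPosition - Camera.main.transform.position;
        Debug.Log("Approximate web cam offset: " + offset);
    }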
     
  5. eXntrc

    eXntrc

    Joined:
    Jan 12, 2015
    Posts:
    21
    Hey @Unity_Wesley, thank you for taking the time to answer questions here!

    The Release Notes for Beta 14 say:
    • Photos that are saved to disk in the JPEG format will contain EXIF meta data.
    Does this EXIF meta data include the projection matrix? And if so, any chance you could show a small code snippet on how to load the image from disk and read the projection information back out in Unity?

    Thank you again!
     
  6. BrandonFogerty

    BrandonFogerty

    Joined:
    Jan 29, 2016
    Posts:
    83
    Hi eXntrc,

    The EXIF meta data does not include the projection matrix. Most of the information recorded reflects the platform the image was captured on and the various camera settings used when capturing the image. If you need to store the projection matrix on disk, you will need to do so yourself. I hope that helps!
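
    If it helps, here's a minimal sketch of persisting the matrix yourself alongside the JPEG (the file naming and plain-text format are just one possible choice):

    Code (CSharp):
    using System.IO;
    using UnityEngine;

    public static class MatrixFile
    {
        // Writes the 16 matrix values as plain text next to the photo.
        public static void Save(string photoPath, Matrix4x4 m)
        {
            string path = Path.ChangeExtension(photoPath, ".matrix.txt");
            var sb = new System.Text.StringBuilder();
            for (int i = 0; i < 16; i++)
            {
                sb.AppendLine(m[i].ToString("R")); // "R" keeps round-trip precision
            }
            File.WriteAllText(path, sb.ToString());
        }

        // Reads the matrix back in the same order it was written.
        public static Matrix4x4 Load(string photoPath)
        {
            string path = Path.ChangeExtension(photoPath, ".matrix.txt");
            string[] lines = File.ReadAllLines(path);
            Matrix4x4 m = new Matrix4x4();
            for (int i = 0; i < 16; i++)
            {
                m[i] = float.Parse(lines[i]);
            }
            return m;
        }
    }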
     
    behram likes this.
  7. eXntrc

    eXntrc

    Joined:
    Jan 12, 2015
    Posts:
    21
    Thanks Brandon. I guess I'm a little confused about the purpose of the Locatable Camera providing its projection matrix if it's not stored with anything it captures. I feel like I could just record the position and direction of the user's head gaze at the moment I make the request to capture a photo. Is the main reason the projection information comes in from the camera when a frame is captured simply that there is potentially a delay between when the request is made and when the frame is actually grabbed, and therefore there would be drift if we just captured the gaze at the time the request is made? Or is there more to it than that?

    Thanks!
     
  8. eXntrc

    eXntrc

    Joined:
    Jan 12, 2015
    Posts:
    21
    @Unity_Wesley and @BrandonFogerty

    Thank you both for your help on this topic. I finally got a sample working, and I think I understand why @JohannesPTC is confused about the offset. At least for me, the matrix is off by several feet, but only on 2 of the 3 axes.

    I've recorded a video of the problem and posted it here: [embedded video]

    Here is the code I'm calling:

    Code (CSharp):
    // Get the camera-to-world matrix
    Matrix4x4 cameraToWorld;
    if (!result.Frame.TryGetCameraToWorldMatrix(out cameraToWorld))
    {
        ErrorManager.Instance.LogError("Captured frame didn't contain location information.");
        return;
    }

    // Extract the camera's world-space position and viewing direction
    var position = cameraToWorld.MultiplyPoint(Vector3.zero);
    var direction = cameraToWorld.MultiplyVector(Vector3.forward);

    // Instantiate the projectile at that position and rotation
    var clone = (GameObject)Instantiate(photoPrefab, position, Quaternion.LookRotation(direction));
    The result object in that code snippet above is my own class and it's part of my camera helper behavior. But the .Frame property is just returning the PhotoCaptureFrame given to me by the TakePhotoAsync callback.

    The rotation is correct, and the position seems to be correct on one axis, but it is not correct on the others, and it is never at the correct height. (You can see in the video that the photos are too low.)

    I've noticed that closing the app and starting it again from a different location seems to yield different results, even in the same room. What am I doing wrong here?

    Thanks!!!
     
  9. BrandonFogerty

    BrandonFogerty

    Joined:
    Jan 29, 2016
    Posts:
    83
    Hi eXntrc.

    The information encoded in the projection matrix represents the real-world RGB color camera. For example, the FOV that the physical RGB camera uses will likely be different from the FOV of your virtual camera. In addition, as you stated, the matrices are generated as soon as the photo is taken. The matrices provided by the PhotoCaptureFrame are useful if you want to take a picture and know where each pixel is in space relative to your real-world RGB camera.
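
    As a rough sketch of that idea, here's one way to turn a pixel coordinate into a world-space ray using the two matrices (a sketch under these assumptions: top-left pixel origin, and the camera looking down -Z in its own space; you may need to flip the y axis for your setup):

    Code (CSharp):
    Ray PixelToWorldRay(int px, int py, int imageWidth, int imageHeight,
                        Matrix4x4 projectionMatrix, Matrix4x4 cameraToWorldMatrix)
    {
        // Pixel -> normalized device coordinates in [-1, 1].
        float ndcX = 2.0f * (px + 0.5f) / imageWidth - 1.0f;
        float ndcY = 1.0f - 2.0f * (py + 0.5f) / imageHeight;

        // Undo the projection at unit depth in front of the camera.
        // m02/m12 account for the principal point, which is not centered
        // on this camera.
        float camX = (ndcX + projectionMatrix.m02) / projectionMatrix.m00;
        float camY = (ndcY + projectionMatrix.m12) / projectionMatrix.m11;
        Vector3 dirCamera = new Vector3(camX, camY, -1.0f);

        Vector3 origin = cameraToWorldMatrix.MultiplyPoint(Vector3.zero);
        Vector3 direction = cameraToWorldMatrix.MultiplyVector(dirCamera).normalized;
        return new Ray(origin, direction);
    }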

    You don't need to use PhotoCapture if you are just trying to fire a projectile from the player's head. I would suggest that you instantiate the projectile from your virtual camera's position and move it along the virtual camera's forward vector. Could you give me a little more information as to what you are trying to do and what you are expecting to see? Thanks!
     
  10. eXntrc

    eXntrc

    Joined:
    Jan 12, 2015
    Posts:
    21
    Sure Brandon. I want to take a photo, then place that photo on a plane in such a position that when I walk back to that location and look at the plane, the photo aligns perfectly with the world behind it. So, for example, if I took a photo of two people sitting on a couch and later went back to review that photo, those people may no longer be sitting on the couch, but the image would align such that, standing where I stood when I took the photo, they would appear to still be on the couch. The outer edges of the image would align with the scene behind the photo.

    Does that make sense?
     
  11. BrandonFogerty

    BrandonFogerty

    Joined:
    Jan 29, 2016
    Posts:
    83
    Hi eXntrc. Thanks for the explanation.

    The following is a test script I wrote to try and simulate what you are doing.

    Code (CSharp):
    using UnityEngine;
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine.VR.WSA.WebCam;
    using UnityEngine.VR.WSA.Input;

    public class PhotoCaptureImageMatrixExample : MonoBehaviour
    {
        GestureRecognizer m_GestureRecognizer;
        GameObject m_Quad = null;
        Renderer m_QuadRenderer = null;
        PhotoCapture m_PhotoCaptureObj;
        CameraParameters m_CameraParameters;
        bool m_CapturingPhoto = false;
        Texture2D m_Texture = null;

        // Use this for initialization
        void Start()
        {
            Initialize();
        }

        void SetupGestureRecognizer()
        {
            m_GestureRecognizer = new GestureRecognizer();
            m_GestureRecognizer.SetRecognizableGestures(GestureSettings.Tap);
            m_GestureRecognizer.TappedEvent += OnTappedEvent;
            m_GestureRecognizer.StartCapturingGestures();

            m_CapturingPhoto = false;
        }

        void Initialize()
        {
            Debug.Log("Initializing...");
            List<Resolution> resolutions = new List<Resolution>(PhotoCapture.SupportedResolutions);
            Resolution selectedResolution = resolutions[0];

            m_CameraParameters = new CameraParameters(WebCamMode.PhotoMode);
            m_CameraParameters.cameraResolutionWidth = selectedResolution.width;
            m_CameraParameters.cameraResolutionHeight = selectedResolution.height;
            m_CameraParameters.hologramOpacity = 0.0f;
            m_CameraParameters.pixelFormat = CapturePixelFormat.BGRA32;

            m_Texture = new Texture2D(selectedResolution.width, selectedResolution.height, TextureFormat.BGRA32, false);

            PhotoCapture.CreateAsync(false, OnCreatedPhotoCaptureObject);
        }

        void OnCreatedPhotoCaptureObject(PhotoCapture captureObject)
        {
            m_PhotoCaptureObj = captureObject;
            m_PhotoCaptureObj.StartPhotoModeAsync(m_CameraParameters, true, OnStartPhotoMode);
        }

        void OnStartPhotoMode(PhotoCapture.PhotoCaptureResult result)
        {
            SetupGestureRecognizer();

            Debug.Log("Ready!");
            Debug.Log("Air Tap to take a picture.");
        }

        void OnTappedEvent(InteractionSourceKind source, int tapCount, Ray headRay)
        {
            if (m_CapturingPhoto)
            {
                return;
            }

            m_CapturingPhoto = true;
            Debug.Log("Taking picture...");
            m_PhotoCaptureObj.TakePhotoAsync(OnPhotoCaptured);
        }

        void OnPhotoCaptured(PhotoCapture.PhotoCaptureResult result, PhotoCaptureFrame photoCaptureFrame)
        {
            if (m_Quad == null)
            {
                m_Quad = GameObject.CreatePrimitive(PrimitiveType.Quad);
                m_Quad.name = "PhotoCaptureFrame";
                m_QuadRenderer = m_Quad.GetComponent<Renderer>();
                m_QuadRenderer.material = new Material(Shader.Find("Custom/Unlit/UnlitTexture"));
            }

            Matrix4x4 cameraToWorldMatrix;
            photoCaptureFrame.TryGetCameraToWorldMatrix(out cameraToWorldMatrix);
            Matrix4x4 worldToCameraMatrix = cameraToWorldMatrix.inverse;

            Matrix4x4 projectionMatrix;
            photoCaptureFrame.TryGetProjectionMatrix(Camera.main.nearClipPlane, Camera.main.farClipPlane, out projectionMatrix);

            photoCaptureFrame.UploadImageDataToTexture(m_Texture);
            m_QuadRenderer.material.SetTexture("_MainTex", m_Texture);

            Vector3 position = cameraToWorldMatrix.MultiplyPoint(Vector3.zero);
            Quaternion rotation = Quaternion.LookRotation(-cameraToWorldMatrix.GetColumn(2), cameraToWorldMatrix.GetColumn(1));

            m_Quad.transform.position = position;
            m_Quad.transform.rotation = rotation;

            Debug.Log("Took picture!");
            m_CapturingPhoto = false;
        }
    }
    This script will place a captured image on a quad at the location and orientation of the web camera when the photo was taken.

    As you can see from the following video, I am not running into any issues with the position or orientation of my quad. I tried this test from many locations and it worked each time.

    [embedded video]

    What version of the technical preview are you using? Are you sure that you are using the correct PhotoCaptureFrame to position your image in world space? Can you try my example and see if you are still encountering the same issue? Thanks!
     
  12. eXntrc

    eXntrc

    Joined:
    Jan 12, 2015
    Posts:
    21
    Thank you very much @BrandonFogerty for replying. Sorry it's taken me so long to get back to you. Darn Real Life (TM) getting in the way of my HoloFun!


    Your sample helped me realize where I'd messed up. My code wasn't actually very far off; in fact, I think my code was OK. But when your code exhibited similar problems, it made me realize that re-parenting some things in the Unity scene editor had caused some of my transforms to change. Once I reset them all to 0,0,0, everything looked very similar to what you demonstrated in your video.

    A few questions:

    1. I see in your code you use this line:

    rotation = Quaternion.LookRotation(-cameraToWorldMatrix.GetColumn(2), cameraToWorldMatrix.GetColumn(1));

    That's a little confusing to me. I think I see what's going on but in my code I had used this:

    rotation = Quaternion.LookRotation(cameraToWorldMatrix.MultiplyVector(Vector3.back));

    They seem to do the same thing. Is mine OK or is it off in some way?


    2. I see in your code that you also got the Projection matrix, but it doesn't appear that you use it. What is the purpose of the projection matrix here?

    3. Most important: Neither of our samples appears to line up perfectly with the world. In your video above, you can see that the blue box is much larger than it should be, and I'm having a similar issue with my approach. I'm also having an issue where the distance of the quad from the floor seems off. Though the "altitude" of the quad seems about right if I take the picture standing at a neutral posture, it doesn't seem right if I'm crouching down or standing on something elevated. Do you see similar problems?

    4. Any thoughts on how to get the correct aspect ratio of the quad? I'm thinking I'll just divide CameraParameters.cameraResolutionWidth by CameraParameters.cameraResolutionHeight and use the result to scale the quad on one axis.

    5. Any thoughts on how to scale the quad (not aspect ratio, but actual scale) to give the appearance of aligning with the world behind it? For example, in your video the cube in the photo is much larger than the physical cube behind it, even when you're standing at the location where the photo was taken. Maybe this is just an aspect ratio issue?

    6. Lastly, (and sorry for the noob question here) what's the easiest way to get a quad to render as 2-sided? Is that just a shader thing? I'm fine with using a quad because it's the fastest to render but it's unsettling when I walk across the room and turn around and all the pictures appear to have been removed! :)

    Thank you again very much for your help. This has gotten me well on track.
     
  13. BrandonFogerty

    BrandonFogerty

    Joined:
    Jan 29, 2016
    Posts:
    83
    Hi eXntrc! Sorry for the long delay in my response.

    The following are my responses to the questions you posed in order.

    1. I wanted the quad to face me, so I am passing in the camera's negated forward vector in world space. I am also using the camera's world-space up for the up parameter of LookRotation to ensure my vectors are orthogonal.

    2. Most of the time, we transform vertices by our virtual camera's view and projection matrices. However, the view and projection matrices returned by PhotoCapture are not relative to your virtual camera; they are relative to the physical web camera. In your example, it was my understanding that you were having issues placing your quad in world space, so my example simply attempts to place a quad at the location where my physical web camera was when I took the picture. At that point, I simply applied the web camera texture, returned by PhotoCaptureFrame, to my quad. I did not apply the physical camera's projection matrix to the texture, which means it will not line up correctly.

    3. The reason the image doesn't look correct is that the physical web camera has its own field of view and aspect ratio, which need to be applied to the texture in a shader for it to line up correctly with the real world. This is what the projection matrix returned by PhotoCaptureFrame is used for.
    You will need to take this projection matrix and apply it to your texture in a custom shader (see the sketch after this list).

    4. You can use the cameraParameters. Another way to do it is to extract the aspect ratio directly out of the PhotoCaptureFrame projection matrix like so.
    Code (CSharp):
    // Extract the aspect ratio from a perspective projection matrix
    float halfOfVerticalFov = Mathf.Atan(1.0f / projectionMatrix.m11);
    float aspectRatio = (1.0f / Mathf.Tan(halfOfVerticalFov)) / projectionMatrix.m00;
    5. I believe it would line up better if you apply the projection matrix to the texture in a shader.

    6. The following shader will disable back face culling when rendering an object.
    Code (CSharp):
    Shader "Unlit/TextureWithNoCull"
    {
        Properties
        {
            _MainTex ("Texture", 2D) = "white" {}
        }
        SubShader
        {
            Tags { "RenderType"="Opaque" }
            Cull Off

            Pass
            {
                CGPROGRAM
                #pragma vertex vert
                #pragma fragment frag

                #include "UnityCG.cginc"

                struct appdata
                {
                    float4 vertex : POSITION;
                    float2 uv : TEXCOORD0;
                };

                struct v2f
                {
                    float4 vertex : SV_POSITION;
                    float2 uv : TEXCOORD0;
                };

                sampler2D _MainTex;
                float4 _MainTex_ST;

                v2f vert (appdata v)
                {
                    v2f o;
                    o.vertex = mul(UNITY_MATRIX_MVP, v.vertex);
                    o.uv = TRANSFORM_TEX(v.uv, _MainTex);
                    return o;
                }

                fixed4 frag (v2f i) : SV_Target
                {
                    fixed4 col = tex2D(_MainTex, i.uv);
                    return col;
                }
                ENDCG
            }
        }
    }
    I hope that helps!
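
    To make item 3 concrete, here's a rough, untested sketch of the projective-texturing idea. _WorldToCameraMatrix and _CameraProjectionMatrix are shader properties you would set from script, e.g. material.SetMatrix("_WorldToCameraMatrix", cameraToWorldMatrix.inverse) and material.SetMatrix("_CameraProjectionMatrix", projectionMatrix):

    Code (CSharp):
    Shader "Unlit/ProjectPhotoSketch"
    {
        Properties
        {
            _MainTex ("Texture", 2D) = "white" {}
        }
        SubShader
        {
            Tags { "RenderType"="Opaque" }
            Cull Off

            Pass
            {
                CGPROGRAM
                #pragma vertex vert
                #pragma fragment frag
                #include "UnityCG.cginc"

                sampler2D _MainTex;
                // Set from script with the matrices from PhotoCaptureFrame.
                float4x4 _WorldToCameraMatrix;
                float4x4 _CameraProjectionMatrix;

                struct v2f
                {
                    float4 vertex : SV_POSITION;
                    float4 worldPos : TEXCOORD0;
                };

                v2f vert (appdata_base v)
                {
                    v2f o;
                    o.vertex = mul(UNITY_MATRIX_MVP, v.vertex);
                    o.worldPos = mul(unity_ObjectToWorld, v.vertex);
                    return o;
                }

                fixed4 frag (v2f i) : SV_Target
                {
                    // Project the world position through the photo camera.
                    float4 camPos = mul(_WorldToCameraMatrix, i.worldPos);
                    float4 clipPos = mul(_CameraProjectionMatrix, camPos);

                    // Perspective divide, then map NDC [-1,1] to UV [0,1].
                    // Depending on texture origin you may need to flip uv.y.
                    float2 uv = (clipPos.xy / clipPos.w) * 0.5 + 0.5;
                    return tex2D(_MainTex, uv);
                }
                ENDCG
            }
        }
    }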
     
  14. BrandonFogerty

    BrandonFogerty

    Joined:
    Jan 29, 2016
    Posts:
    83
  15. erics_vulcan

    erics_vulcan

    Joined:
    Apr 2, 2016
    Posts:
    4
    We made a Unity plugin called CameraStream that aims to simplify this problem (until Unity solves it directly). Ideally some of @BrandonFogerty's work will be integrated into this so that we have easier-to-use matrices.

    Hopefully this helps some people.
     
  16. Xyy_1209

    Xyy_1209

    Joined:
    Nov 12, 2017
    Posts:
    6
    In your code, I see:
    Quaternion rotation = Quaternion.LookRotation(-cameraToWorldMatrix.GetColumn(2), cameraToWorldMatrix.GetColumn(1));

    Can you explain the meaning of this line? What is the form of the cameraToWorldMatrix? Thank you!
     
  17. BrandonFogerty

    BrandonFogerty

    Joined:
    Jan 29, 2016
    Posts:
    83
    Hi @Xyy_1209,

    The physical camera is not the same thing as the HoloLens itself. The in-game main camera will automatically inherit the HoloLens' position/orientation. The physical web camera, that has the ability to take photos, is not located at the same physical location/orientation as the HoloLens itself. The cameraToWorldMatrix represents the transformation matrix from the physical web camera on the HoloLens to its in-game world position/rotation. This matrix is very useful to have if you need to know where your physical HoloLens camera is located in your game world relative to your various game objects.

    I wanted the in-game quad to precisely face the physical web camera on the HoloLens. The Quaternion.LookRotation requires a forward and an up vector. Those two vectors are used to generate a quaternion which can orient an object to look in a certain direction.
    https://docs.unity3d.com/ScriptReference/Quaternion.LookRotation.html

    The 1st column of the cameraToWorldMatrix represents the camera's right vector.
    The 2nd column of the cameraToWorldMatrix represents the camera's up vector.
    The 3rd column of the cameraToWorldMatrix represents the camera's forward vector.

    The camera's forward vector will naturally point in the orientation of my head. However, I want the quad to be facing me and not away from me; if it faces away from me, I won't see it at all because of backface culling. Therefore I negate the camera's forward vector so that the quad faces in the exact opposite direction from the one I am looking in, which makes the quad face me. I hope that makes sense.
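
    In code terms, a small sketch of the same idea:

    Code (CSharp):
    // Columns of the cameraToWorldMatrix, as described above:
    Vector3 right    = cameraToWorldMatrix.GetColumn(0);
    Vector3 up       = cameraToWorldMatrix.GetColumn(1);
    Vector3 forward  = cameraToWorldMatrix.GetColumn(2);
    // The 4th column holds the camera's world-space position.
    Vector3 position = cameraToWorldMatrix.GetColumn(3);

    // Negate forward so the quad turns back toward the camera.
    Quaternion rotation = Quaternion.LookRotation(-forward, up);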
     
    frank420 and Xyy_1209 like this.
  18. Xyy_1209

    Xyy_1209

    Joined:
    Nov 12, 2017
    Posts:
    6
    Thank you very much for your detailed explanation! I have gotten a lot from your reply! Thanks again!