Speech with Entra: Azure Speech Endpoint: Unable to contact server. StatusCode: 1006 #2692

Open
NicoPlattner opened this issue Dec 4, 2024 · 1 comment
Labels: update needed (for items that are in progress but have not been updated)

NicoPlattner commented Dec 4, 2024

Describe the bug

I am getting the following error trying to use the Speech SDK with Entra authentication.
Error: Unable to contact server. StatusCode: 1006, wss://westeurope.tts.speech.microsoft.com/cognitiveservices/websocket/v1 Reason: undefined

I am using a Cognitive Services multi-service account resource. When I query the translation endpoint with my auth token, everything works fine. It fails, however, when I use the speech endpoint through the Speech SDK.
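
For reference, the repro below passes the SDK an authorization string of the form aad#<RESOURCE ID>#<token>. A minimal sketch of that concatenation as a helper with basic input checks follows. The name buildAadSpeechToken is hypothetical and is not a Speech SDK function; the format is simply the one used in page.js.

```javascript
// Hypothetical helper (not part of the Speech SDK): builds the
// "aad#<resource ID>#<access token>" string used in the repro.
function buildAadSpeechToken(resourceId, accessToken) {
  if (typeof resourceId !== "string" || resourceId.trim() === "") {
    throw new TypeError("resourceId must be a non-empty string");
  }
  if (typeof accessToken !== "string" || accessToken.trim() === "") {
    throw new TypeError("accessToken must be a non-empty string");
  }
  return "aad#" + resourceId + "#" + accessToken;
}
```

In the repro this would be used as sdk.SpeechConfig.fromAuthorizationToken(buildAadSpeechToken(resourceId, token), region). Failing early on an empty resource ID or token helps separate a configuration mistake from a service-side rejection.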

To Reproduce

Here's my minimal repro using Next.js

dependencies

    "next": "^14.2.15",
    "react": "^18",
    "react-dom": "^18.3.1",
    "@azure/identity": "^4.4.0",
    "@azure/msal-browser": "^3.26.1",
    "@azure/msal-react": "^2.1.1",
    "microsoft-cognitiveservices-speech-sdk": "^1.40.0"

layout.js

'use client'

import { InteractionType, PublicClientApplication } from "@azure/msal-browser";
import { MsalProvider, MsalAuthenticationTemplate } from "@azure/msal-react";

// Created once at module scope so a new MSAL instance is not built on every render.
const msalConfig = {
  auth: {
    clientId: "APPLICATION ID",
    authority: "https://login.microsoftonline.com/<TENANT ID>",
    redirectUri: 'http://localhost:3000',
  },
  cache: {
    cacheLocation: "localStorage",
    storeAuthStateInCookie: false,
  },
};

const msalInstance = new PublicClientApplication(msalConfig);

export default function RootLayout({ children }) {

  return (
    <html lang="en">
      <body>
        <MsalProvider instance={msalInstance}>
          <MsalAuthenticationTemplate interactionType={InteractionType.Redirect}>
            {children}
          </MsalAuthenticationTemplate>
        </MsalProvider>
      </body>
    </html>
  );
}

page.js

'use client'

import { InteractionRequiredAuthError } from "@azure/msal-browser"
import { useMsal } from "@azure/msal-react"
import * as sdk from "microsoft-cognitiveservices-speech-sdk"
import { useEffect, useState } from "react"


export default function Home() {
  const { instance } = useMsal()
  const [error, setError] = useState(null)

  const region = "westeurope"
  const resourceId = "<RESOURCE ID>"

  const tokenRequest = {
      scopes: ["https://cognitiveservices.azure.com/.default"]
  };

  // Acquire a token silently; prompt only when MSAL reports that interaction is required.
  const fetchToken = async () => {
    try {
        const account = instance.getAllAccounts()[0];

        const response = await instance.acquireTokenSilent({
            ...tokenRequest,
            account,
        })
        return response.accessToken
    } catch (error) {
        if (error instanceof InteractionRequiredAuthError) {
            const response = await instance.acquireTokenPopup(tokenRequest)
            return response.accessToken
        }
        throw error
    }
  };

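  useEffect(() => {
    fetchToken().then(token => {
      // Authorization string format used here: aad#<resource ID>#<access token>
      const authToken = "aad#" + resourceId + "#" + token
      const speechConfig = sdk.SpeechConfig.fromAuthorizationToken(authToken, region)
      const synthesizer = new sdk.SpeechSynthesizer(speechConfig)

      synthesizer.speakTextAsync(
        "Hello, world!",
        function (result) {
          if (result.reason === sdk.ResultReason.SynthesizingAudioCompleted) {
            console.log("synthesis finished.")
          } else {
            setError(result.errorDetails)
            console.error("Speech synthesis canceled, " + result.errorDetails)
          }
          // Close only after the result arrives, not right after starting the request.
          synthesizer.close()
        },
        function (err) {
          // Error callback: surface the failure and release the synthesizer.
          setError(String(err))
          console.error("Speech synthesis failed, " + err)
          synthesizer.close()
        })
    }).catch(err => setError(String(err)))
  }, [instance])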


  return (
    <div>
      Error: {error}
    </div>
  );
}

Expected behavior

speakTextAsync should complete with SynthesizingAudioCompleted instead of returning a canceled result with an error.
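
To make the expected and failing outcomes easy to tell apart, the callback-style call can be wrapped in a Promise that resolves on completion and rejects on a canceled result. The sketch below is only an illustration: speakOnce is a hypothetical helper, it assumes an object exposing speakTextAsync(text, onResult, onError) and close() as the SDK's SpeechSynthesizer does, and the caller passes in the "completed" reason value (for the SDK, sdk.ResultReason.SynthesizingAudioCompleted).

```javascript
// Hypothetical wrapper (not an SDK function). `synthesizer` is assumed to
// expose speakTextAsync(text, onResult, onError) and close().
function speakOnce(synthesizer, text, completedReason) {
  return new Promise((resolve, reject) => {
    synthesizer.speakTextAsync(
      text,
      (result) => {
        // Release the synthesizer only once a result has been delivered.
        synthesizer.close();
        if (result.reason === completedReason) {
          resolve(result);
        } else {
          reject(new Error("Speech synthesis canceled, " + result.errorDetails));
        }
      },
      (err) => {
        synthesizer.close();
        reject(err instanceof Error ? err : new Error(String(err)));
      }
    );
  });
}
```

With the SDK this could be called as speakOnce(new sdk.SpeechSynthesizer(speechConfig), "Hello, world!", sdk.ResultReason.SynthesizingAudioCompleted), with a .catch handler to report the error details.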

Version of the Cognitive Services Speech SDK

1.41.0

Platform, Operating System, and Programming Language

Platform: Next.js, Browser
Operating System: Windows 11
Programming Language: JavaScript

Additional context
Error occurring in: WebsocketMessageAdapter.js

WebSocket connection to 'wss://westeurope.tts.speech.microsoft.com/cognitiveservices/websocket/v1?Authorization=Bearer%20aad%23<RESSOURCEID>%23<TOKEN>&X-ConnectionId=<SOMEID>' failed: 
open @ WebsocketMessageAdapter.js:73
open @ WebsocketConnection.js:68
eval @ SynthesisAdapterBase.js:295

Error occurring in: page.js

Speech synthesis canceled, Unable to contact server. StatusCode: 1006,
                    wss://westeurope.tts.speech.microsoft.com/cognitiveservices/websocket/v1 Reason:  undefined
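
The WebSocket URL in the log above carries the access token in its Authorization query parameter, so it has to be masked before the log is shared, as was done by hand here. A minimal sketch of doing that programmatically follows; redactSpeechUrl is a hypothetical helper, and it assumes the "Bearer aad#<resource ID>#<token>" shape seen in the log.

```javascript
// Hypothetical helper: masks the token part of the Authorization query
// parameter in a Speech WebSocket URL before the URL is logged or shared.
function redactSpeechUrl(rawUrl) {
  const url = new URL(rawUrl);
  const auth = url.searchParams.get("Authorization");
  if (auth !== null) {
    // Assumed shape: "Bearer aad#<resource ID>#<token>"
    const parts = auth.split("#");
    if (parts.length >= 3) {
      parts[parts.length - 1] = "<TOKEN>";
      url.searchParams.set("Authorization", parts.join("#"));
    } else {
      // Unknown shape: hide the whole value rather than guess.
      url.searchParams.set("Authorization", "<REDACTED>");
    }
  }
  return url.toString();
}
```

The returned URL is re-encoded by URLSearchParams, so characters such as # and < appear percent-encoded in the output.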

Would be super grateful for any help!

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.

github-actions bot added the "update needed" label on Dec 26, 2024.