Startup: AssemblyAI Represents New Generation Speech Recognition

3 minutes read

By AI Trends Staff

Advances in the AI behind speech recognition are driving growth in the market, attracting venture capital and funding for startups, and posing challenges to established players.

The growing acceptance and use of speech recognition devices are driving the market, which according to an estimate by Meticulous Research is expected to reach $26.8 billion worldwide by 2025, according to a recent account in Analytics Insight. Better speed and accuracy are among the benefits of the more advanced technology.

Into this new growth comes AssemblyAI of San Francisco, which offers a speech recognition API able to transcribe videos, podcasts, phone calls and remote meetings. The company was founded by CEO Dylan Fox in 2017 and has received backing from Y Combinator, a startup accelerator, as well as NVIDIA.

Fox has an unusual background for a high-tech entrepreneur. He graduated from George Washington University with a degree in business administration, business economics and public policy. He got a job as a software engineer at Cisco, in its emerging products lab in San Francisco, where he worked on deep neural networks and machine learning. There he got the idea for AssemblyAI and attracted capital from Y Combinator, which enabled him to hire data scientists and engineers to build the technology from the ground up.

Asked in an interview with AI Trends how he moved from undergraduate studies in business administration and economics to high-tech entrepreneur, Fox said, “I taught myself how to program, which led me down the path of machine learning. I was looking for a harder challenge, which led me to natural language processing, which got me to Cisco.” The team there was working on an enterprise counterpart to Apple's Siri at that time.

To speed up the work, Cisco was looking to acquire speech recognition software, and Fox was in the catbird seat for the search. “We looked at Nuance,” he said, for example, a company acknowledged as a market leader and the owner of more speech recognition software than its competitors. (Microsoft's acquisition of Nuance for $19.6 billion is expected to be completed by the end of the year.) “It was crazy how bad all the options were, from an accuracy and a developer's point of view,” he said.

He pointed to Twilio, a San Francisco-based company founded in 2008, which that same year released the Twilio Voice API for making and receiving phone calls hosted in the cloud. The company has since raised $103 million in venture capital. “They were setting new standards for what a good API for developers looks like,” said Fox.

Fox's idea was to use AI and machine learning to achieve “very accurate results, and make it easy for developers to integrate the API into their products.” One customer is CallRail, a provider of call tracking and marketing analytics software, which plans to integrate the AssemblyAI API to gain insight into why people are calling. Other customers include NBC and the Wall Street Journal, which use the product to transcribe content and interviews and to provide closed captioning.

“We worked to get as close to human-quality speech recognition as possible,” Fox said. “It was a lot of work.” He expects to reach that plateau in 2022.

The company aims to make it easy for customers to integrate speech recognition into their products, and easy to buy. Customers pay on a usage basis; for every second of audio transcribed, AssemblyAI receives a fraction of a penny. Customers receive a monthly bill. A customer using 10 hours a month pays about $9. A customer using a million hours a month pays about $900,000.
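The usage-based figures above can be checked with a short sketch. The hourly rate here is an assumption back-derived from the article's "about nine dollars for 10 hours" figure, not a published price, and `monthly_cost` is a hypothetical helper name used only for illustration.

```python
from decimal import Decimal

# Assumed rate, back-derived from the article's figure of about $9 for 10 hours.
# This is an illustration, not an official AssemblyAI price.
RATE_PER_HOUR = Decimal("0.90")
RATE_PER_SECOND = RATE_PER_HOUR / 3600  # a fraction of a penny per second


def monthly_cost(seconds_transcribed: int) -> Decimal:
    """Approximate monthly bill in dollars for the given seconds of audio."""
    return (RATE_PER_SECOND * seconds_transcribed).quantize(Decimal("0.01"))


print(monthly_cost(10 * 3600))         # 10 hours of audio
print(monthly_cost(1_000_000 * 3600))  # one million hours of audio
```

Under that assumed rate, each second of audio costs $0.00025, or one fortieth of a cent, which is consistent with both the 10-hour and the million-hour figures quoted.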

Speech recognition is a hot market. “A lot of new startups are launching,” said Fox. “Many interesting new companies are being built on audio data.”

The AssemblyAI product can detect sensitive topics such as hate speech and profanity, so that customers can save on human content moderation.

“We are a team of deep learning researchers,” said Fox. “We are building very large, very accurate deep learning models that deliver much more accurate recognition results than traditional machine learning approaches. We build really large models using advanced neural network technologies.” The approach is similar to what OpenAI used to develop its GPT-3 large language model.

In addition, the company is building AI features on top of the transcripts, to provide summaries of audio and video content that can be searched and indexed. “It goes beyond mere transcription,” said Fox.

The company currently has 25 employees and expects to double in size in about four months. Business has been good. “There is an explosion of audio and video data online, and customers want to be able to take advantage of it, so we are seeing a lot of demand,” Fox said.

Learn more at AssemblyAI.

2021-10-21 20:07:00
