We pride ourselves on working hand in hand with our clients to create the best solutions for their unique needs.
What do we do?
Ugiat Technologies is a company specializing in digital image and audio processing, computer vision, media content analysis and natural language understanding solutions. Through our B2B model, multiple companies have used our services to reduce costs, automate their workflows and improve their efficiency. Unlike most companies, Ugiat Technologies provides custom software and dedicated support to solve each client's specific problems.
Image Classification, Segmentation and Captioning
Our technology can understand image content in several ways. Image classification is useful for information retrieval and content categorization. For instance, sensitive visual content can be filtered by detecting terrorism, sex scenes and other relevant categories. Image segmentation and pattern recognition allow the automation of some manufacturing processes, logo detection and recognition, among others. For example, they can be applied to an image-based quality inspection process in which manufactured components must be segmented and evaluated. Image captioning describes the image content in order to understand the action the image represents. It is useful for monitoring events such as the presence of sign language interpretation.
Speech-To-Text, Closed Captioning and Text-To-Speech
Create automatic speech-to-text transcriptions from any audio or video format. We provide a hybrid solution that combines off-the-shelf libraries and third-party services to obtain the best transcription quality. The main languages are Spanish, Catalan and English, but support can be extended to more than 40 languages. Generate closed captions using custom tools that enhance punctuation and capitalization, and export them to the most popular formats (SRT, VTT, etc.). Synthesize speech from text using custom voice models or by cloning the original voices. This service is useful for voice-over or for dubbing video content into multiple languages.
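As an illustration of the caption export step, here is a minimal Python sketch that writes timed transcript segments in SRT format. The names `Cue`, `srt_timestamp` and `to_srt` are hypothetical and chosen only for this example; they are not part of Ugiat's tools, and the sketch assumes the transcription step already yields start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption: start and end times in seconds, plus the caption text."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as SRT text: numbered blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{timing}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical sample cues, for illustration only.
    sample = [Cue(0.0, 2.5, "Hello."), Cue(2.5, 4.0, "Welcome.")]
    print(to_srt(sample), end="")
```

WebVTT uses a very similar layout, with a `WEBVTT` header line and a period instead of a comma before the milliseconds, so the same structure could be adapted for that format.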
Natural Language Processing
Understand text and spoken words in much the same way human beings can. NLP is useful for named entity recognition, sentiment analysis, natural language generation, part-of-speech tagging and more. Advanced applications include summarization, semantic segmentation of video content and detection of the most relevant topics. It is also useful for information retrieval in large databases.
Speaker Diarization and Face Recognition
Identify and find people in images, audio and video. Find out who speaks and when in order to monitor their presence and interventions, or implement biometric applications using facial or speaker recognition. These features allow the implementation of more advanced applications, for example a dubbing system that uses specific voices for the different speakers.
Recent projects
Do you want to know more? Take a look at some of our projects:
Improve the accessibility of your videos by dubbing them into multiple languages using cloned voices.
Segmentation and summarization of media content. Extract relevant topics and define the most suitable tags for better indexing.
Smart video player with semantic segmentation, speaker recognition and advanced video search through the extracted high-level metadata.
Visualize and buy products in real time from your home. Digitize and recognize all the products of a supermarket.
Maintain user attention by placing ads smartly, without interrupting video scenes, and by choosing the ads most relevant to the video content.
Monitoring system for video content events. Generate reports on the target metrics to ensure your quality criteria are met.
Who already trusts us
LET'S GET IN TOUCH!
Do not hesitate to contact us to test our solution or to share new ideas. Send us an email and we will get back to you as soon as possible!