American technology giant Microsoft has launched an Android version of its “Seeing AI” app, which is designed to assist blind and visually impaired users. Microsoft said the app is available for download from the Google Play Store and supports 18 languages, with plans to extend support to 36 languages next year.
The “Seeing AI” app audibly describes the surroundings and objects near the user, aiming to aid blind and visually impaired individuals in tasks such as reading emails, identifying products, and hearing descriptions of pictures.
Users can point their smartphone camera in any direction, and the app will describe the content captured in the image. The technology-focused website “CNET.com” noted that the app includes various categories for different tasks.
For instance, the “Short Text” feature reads text aloud as soon as it is placed in front of the camera. The “People” feature identifies individuals around the user when their faces are captured by the camera. The “Currency” function helps users recognize banknotes they are handling.
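Microsoft has not published the internal pipeline behind the “Short Text” feature, but the general idea of reading camera-captured text aloud can be sketched with off-the-shelf components. The example below is a minimal illustration, using the open-source pytesseract (OCR) and pyttsx3 (text-to-speech) packages as stand-ins for whatever Microsoft uses; the image file name is hypothetical.

```python
# Illustrative only: Seeing AI's own pipeline is not public. This sketches the
# general "Short Text" idea -- OCR a camera frame, then speak the result --
# using pytesseract and pyttsx3 as stand-ins.
from PIL import Image
import pytesseract   # OCR wrapper around the Tesseract engine
import pyttsx3       # offline text-to-speech

def read_text_aloud(image_path: str) -> str:
    """Extract printed text from an image and speak it."""
    text = pytesseract.image_to_string(Image.open(image_path)).strip()
    if text:
        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
    return text

if __name__ == "__main__":
    print(read_text_aloud("label_photo.jpg"))  # hypothetical sample frame
```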
The “Scenes” feature allows users to hear a description of a photographed location. Users can move their finger across the screen while viewing an image to understand the placement of different elements within it. Additionally, the Seeing AI app can read handwritten texts and identify colors.
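Microsoft has not said which back-end services power the “Scenes” descriptions. A comparable describe-a-photo feature can, however, be approximated with Azure’s public image-captioning service; the sketch below uses the Azure AI Vision Image Analysis SDK (azure-ai-vision-imageanalysis), with the endpoint, key, and file name as placeholders.

```python
# Illustrative only: an approximation of a "Scenes"-style description using
# Azure AI Vision image captioning, not Seeing AI's actual implementation.
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

client = ImageAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com",  # placeholder
    credential=AzureKeyCredential("<your-key>"),                     # placeholder
)

with open("scene_photo.jpg", "rb") as f:                             # placeholder image
    result = client.analyze(
        image_data=f.read(),
        visual_features=[VisualFeatures.CAPTION],  # request a natural-language caption
    )

if result.caption is not None:
    print(f"{result.caption.text} (confidence {result.caption.confidence:.2f})")
```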
Until now, the “Seeing AI” app had been available only for iPhones and iPads running the iOS operating system.
Both the Android and iOS versions include updated features, such as more detailed image descriptions and the option to ask questions about any scanned document, for example the contents of a menu or the price of an item on a bill. The app can also summarize articles upon request.
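Microsoft has not disclosed how the document question-answering and summarization features are built; one plausible pattern is to pair text extracted from the document with a large language model. The sketch below assumes the OpenAI Python SDK, a placeholder model name, and document text produced by an OCR step such as the earlier sketch.

```python
# Illustrative only: Microsoft has not published how Seeing AI answers
# questions about scanned documents. This sketch simply sends previously
# extracted document text plus a question to a language model.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def ask_about_document(document_text: str, question: str) -> str:
    """Answer a question (e.g. 'How much is the pasta?') about scanned text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Answer questions using only the supplied document text."},
            {"role": "user",
             "content": f"Document:\n{document_text}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

# Example: pass in menu_text = read_text_aloud("menu_photo.jpg") from the OCR sketch above.
```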
Microsoft has been at the forefront of AI advancements, demonstrating a commitment to both pushing the boundaries of AI technology and ensuring its responsible use. In 2022, Microsoft Research made significant strides in developing powerful large-scale AI models, notably in language understanding and computer vision. Its Turing Universal Language Representation model and Project Florence-VL team achieved groundbreaking results in these fields, and models such as NUWA-Infinity can generate high-resolution images and long-duration videos from text, image, and video inputs.
Furthering its AI innovation, Microsoft has integrated ChatGPT capabilities into tools for search, collaboration, and learning, signaling a deeper commitment to AI-driven transformation. The company has also rethought its cloud infrastructure to optimize AI performance, developing new AI-optimized chips such as Azure Maia and Azure Cobalt. These chips are designed to accelerate AI workloads, including those behind OpenAI models, Bing, GitHub Copilot, and ChatGPT.




