How to make AI see what matters to your business
There are so many great use cases for AI in digital asset management (DAM), and this has got to be one of our favorites. Integrating AI face and object detection into your DAM enriches metadata, improves organic and semantic search, and assists with deduplication.
If your business has a large amount of visual data to process and organize, it’s a game-changer.
AI visual recognition is useful for broadly identifying an entire image, and it can aid with basic descriptions and accessibility compliance. AI object detection goes further: it can pinpoint details within an image.
Here, we’ll mostly look (no pun intended) at strategies to enable AI object detection in your DAM, and how to get specific with AI that sees your company’s leaders, products, and other important brand assets. If you’re more curious about the nuances of object detection versus recognition, you can check out this excellent overview from our partners over at Cloudinary.
How specific can AI face and object detection get?
Sometimes an iconic example can help show the difference between generic and highly specific AI. So, let’s take a look at this vintage photo of Superman to help us understand the power of AI object detection and AI face detection in your DAM.
“It’s a bird, it’s a plane, no, it’s Superman!”
When correctly enabled, AI object detection can identify Superman as portrayed by Kirk Alyn.
Basic visual recognition or generic AI would likely tag this photo as a man. It might flag that he has a distinctive mark on his clothing and, depending on the implementation, might describe his facial expression.
AI object detection with some context and instructions might correctly identify Superman.
AI object detection that’s been trained and fine-tuned on a DC knowledge base would tag it more comprehensively as Superman, the actor Kirk Alyn, the 1940s, and identify the iteration of the Superman Symbol. If he were in front of a specific building or a notable location, it would pinpoint that. And you would be able to search the DAM under any of those terms.
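To make the "search the DAM under any of those terms" idea concrete, here is a minimal sketch of how detected tags could feed a search index. The asset IDs, the tag lists, and the function names `build_tag_index` and `search` are all hypothetical, and a real DAM would use its own metadata store and search engine rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical detection output: asset ID -> tags produced by a tuned detector.
detections = {
    "asset-001": ["Superman", "Kirk Alyn", "1940s", "Superman Symbol"],
    "asset-002": ["city skyline", "1940s"],
}


def build_tag_index(detections):
    """Map each lowercased tag to the set of asset IDs that carry it."""
    index = defaultdict(set)
    for asset_id, tags in detections.items():
        for tag in tags:
            index[tag.lower()].add(asset_id)
    return index


def search(index, term):
    """Return the asset IDs tagged with the term, sorted for stable output."""
    return sorted(index.get(term.lower(), set()))


index = build_tag_index(detections)
print(search(index, "Kirk Alyn"))
print(search(index, "1940s"))
```

The more specific the tags the detector produces, the more terms each asset can be found under, which is the whole point of tuning the model on your own knowledge base.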
Three strategies for AI enablement in your DAM
So far, so compelling. But what are the best paths to AI enablement for DAM? The right one for your organization will depend on privacy requirements, budget, and the level of specificity you require.
- Custom on-premises/Cloud: Build your own AI infrastructure by hosting your own servers, buying your own GPUs, and building and fine-tuning your own computer vision model.
- Cloud hyperscalers: Use ready-to-go API endpoints and computer vision models from providers like Microsoft Azure and Amazon Rekognition, then add custom labels to make it easy to find specific people in your images.
- Hybrid open source: Host an open source, real-time object detection model like YOLO on your own infrastructure as a microservice. Alongside your knowledge base and RAG (retrieval-augmented generation), you control the model tuning to meet your requirements.
Since we’re both DAM agnostic and AI agnostic here at iSoftStone, our AI enablement and content services teams can help you with any of these options. (We even have an off-the-shelf AI infrastructure solution to make Strategy 1 easier!) However, many of our clients gravitate towards the last two options.
This is partly due to cost considerations and partly to flexibility and scale.
Clients with edge use cases opt for open-source computer vision
Custom on-premises offers complete privacy and control of your AI. That said, it’s definitely not one-size-fits-all. Because of the significant costs involved, it’s often a more appealing option for large enterprises in sectors with global reach, such as fintech and insurance, where privacy and data protection are primary concerns.
Cloud hyperscalers are fast to implement and (obviously) grow easily with your business. They come with the assured quality of providers like AWS and Microsoft. Many will charge per asset or per processing unit, so your systems integrator will need a careful implementation plan for how requests are run. With fine-tuning out of your control, they’ll also need to ensure the prompts are tight and the knowledge base is thorough enough to get quality results.
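One way to picture a request plan for per-asset pricing is to skip byte-identical duplicates before anything is sent to the provider. The sketch below is an illustration only: the flat price is a made-up placeholder (real provider pricing varies and is often tiered), and `plan_requests` and `estimate_cost` are hypothetical helper names.

```python
import hashlib

# Assumption: a flat, made-up per-asset price used purely for illustration.
PRICE_PER_ASSET_USD = 0.001


def content_hash(data: bytes) -> str:
    """Fingerprint the raw bytes so identical files share one key."""
    return hashlib.sha256(data).hexdigest()


def plan_requests(assets: dict[str, bytes]) -> dict[str, list[str]]:
    """Group asset IDs by content hash so each unique image is sent once."""
    groups: dict[str, list[str]] = {}
    for asset_id, data in assets.items():
        groups.setdefault(content_hash(data), []).append(asset_id)
    return groups


def estimate_cost(groups: dict[str, list[str]], price: float = PRICE_PER_ASSET_USD) -> float:
    """One billable request per unique image, not per stored copy."""
    return len(groups) * price


# Three stored assets, two of which are the same file.
assets = {"hero.jpg": b"image-a", "hero-copy.jpg": b"image-a", "logo.png": b"image-b"}
groups = plan_requests(assets)
print(len(assets), "assets,", len(groups), "billable requests")
```

A SHA-256 hash only catches exact copies. Resized or re-encoded versions of the same image would need a perceptual hash or a similarity check instead.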
Hybrid open-source offers cost-effective flexibility for organizations that require high specificity and may have a few edge use cases. Going open source allows you to tune the computer vision model and train it if hyperscaler-trained models don’t meet your needs. You’ll be able to provide as much context as you need, as well as refine the prompts. This approach delivers granular specificity at a lower associated cost.
The use cases and benefits of automated tagging and AI object detection
Being able to auto-populate tags and create metadata on ingestion saves many hours and makes everything in your DAM more findable and efficient. So does the ability to retroactively batch-process the important visual data already in your library.
It’s an easy leap to think of use cases for enterprises or retailers with many thousands of SKUs. Imagine you want to search your image library for a compelling photo of your leader in a casual setting, maybe at a work party and where they’re laughing, or one with their spouse, or a photo of two senior team members in a group setting. AI face detection will help with that.
Maybe you need to search for every product in a particular color or find details in product images and specific logos for sub-brands. Or, say, you’re a multinational organization with distinctive office buildings you want to be sure you can showcase. AI object detection can make that happen.
It’s not just about searching in the DAM, either. You could use your product information management (PIM) system to prompt the AI object detection, then send the detected tags back into the PIM before the data gets exported downstream to the e-commerce platform.
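The PIM round trip can be sketched roughly as follows. Everything here is an assumption made for illustration: the `PimRecord` fields, the prompt wording, and the `fake_detect` stand-in, which takes the place of whatever detection model or API you actually use.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PimRecord:
    """Hypothetical, simplified product record."""
    sku: str
    name: str
    color: str
    detected_tags: list[str] = field(default_factory=list)


def build_prompt(record: PimRecord) -> str:
    """Turn known product data into context for the detector."""
    return f"Find the product '{record.name}' in {record.color}; list visible logos and details."


def enrich(record: PimRecord, image: bytes,
           detect: Callable[[bytes, str], list[str]]) -> PimRecord:
    """Run detection with PIM context and write the tags back to the record."""
    tags = detect(image, build_prompt(record))
    # Merge with existing tags, dropping repeats while keeping first-seen order.
    record.detected_tags = list(dict.fromkeys(record.detected_tags + tags))
    return record


def fake_detect(image: bytes, prompt: str) -> list[str]:
    """Stand-in detector for illustration; a real one would call a model."""
    return ["trail shoe", "red", "red"]


record = enrich(PimRecord("SKU-1", "Trail Shoe", "red"), b"", fake_detect)
print(record.detected_tags)
```

Passing the detector in as a function keeps the sketch neutral about which of the three strategies sits behind it.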
We’re delivering exactly these kinds of solutions and features for our clients right now, including for a Nuxeo DAM user with a global presence and a vast visual library who opted to take the hybrid open-source approach.
AI-powered DAM for your unique business needs
The real value of an AI-powered DAM can only come from making it see what matters most to you and your organization. In visual intelligence, as with everything AI, specificity isn’t simply better — it’s everything.