Our client, an ecommerce company, often received product information from manufacturers as PDF documents containing technical details in long, unstructured paragraphs or tables. Before these products could be listed online, the client's team had to manually review each file to extract key attributes such as size, weight, and material. This task was repetitive and time-consuming, and any data-entry mistake could lead to inaccurate product listings. As the number of products grew, it became harder for the team to keep up with the workload while maintaining data accuracy and consistency across product pages.
To address this challenge, Rannsolve’s AI-powered Cognitive Data Extractor (CDE) was introduced to automate attribute extraction from product specification documents. Using a template-free parsing approach powered by machine learning, CDE segmented the text of each document and extracted the relevant product attributes. A semantic tagging engine, augmented with product-specific dictionaries, identified and classified key attributes accurately across product categories. This approach made it possible to process thousands of specification sheets with minimal human intervention.
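As a rough illustration of the dictionary-augmented tagging idea described above, the following Python sketch segments free text and maps manufacturer wording onto canonical attributes. It is not Rannsolve's implementation: CDE is described as machine-learning based, whereas this sketch substitutes simple keyword and regular-expression rules. The names `ATTRIBUTE_SYNONYMS`, `VALUE_PATTERNS`, `segment`, and `extract_attributes`, as well as the dictionary entries and sample text, are all hypothetical.

```python
import re

# Hypothetical product-specific dictionary: canonical attribute -> wording
# a manufacturer's spec sheet might use. Entries are illustrative only.
ATTRIBUTE_SYNONYMS = {
    "weight": ["weight", "net weight", "mass"],
    "size": ["size", "dimensions", "measurements"],
    "material": ["material", "made of", "constructed from"],
}

# Hypothetical value patterns per attribute (a rule-based stand-in for
# whatever learned model a production system would use).
VALUE_PATTERNS = {
    "weight": re.compile(r"\d+(?:\.\d+)?\s*(?:kg|g|lbs?|oz)\b", re.IGNORECASE),
    "size": re.compile(
        r"\d+(?:\.\d+)?(?:\s*x\s*\d+(?:\.\d+)?)+\s*(?:mm|cm|m|in)\b",
        re.IGNORECASE,
    ),
    "material": re.compile(
        r"\b(?:stainless steel|aluminium|aluminum|plastic|cotton|leather|wood)\b",
        re.IGNORECASE,
    ),
}


def segment(text):
    """Split free text into rough sentence-like segments."""
    parts = re.split(r"(?<=[.;])\s+|\n+", text)
    return [p.strip() for p in parts if p.strip()]


def extract_attributes(text):
    """Return the first value found for each attribute in the dictionary."""
    found = {}
    for seg in segment(text):
        lowered = seg.lower()
        for attr, synonyms in ATTRIBUTE_SYNONYMS.items():
            if attr in found:
                continue  # keep the first match per attribute
            # Only look for a value when the segment mentions the attribute.
            if any(syn in lowered for syn in synonyms):
                match = VALUE_PATTERNS[attr].search(seg)
                if match:
                    found[attr] = match.group(0)
    return found


if __name__ == "__main__":
    sample = (
        "The housing is made of stainless steel. Net weight: 2.5 kg. "
        "Dimensions are 30 x 20 x 10 cm."
    )
    print(extract_attributes(sample))
```

In this sketch the dictionary decides which segments are relevant to an attribute, and the value pattern decides what to extract from them. A template-free, learned approach would presumably replace both steps with models that generalize beyond fixed keyword lists.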
CDE delivered immediate, measurable results. Attribute coverage reached over 90%, significantly improving the quality and completeness of product data. Product onboarding time was reduced by 70%, allowing the client's ecommerce teams to upload more than 30,000 SKUs per month. As a result, marketplace listings went live faster, improving both time-to-market and the user experience on product detail pages. This transformation not only streamlined the client's internal workflows but also gave it a competitive edge in online retail.