| dc.description.abstract | I present a thorough study and the development of a Visual Product Search System for an e-commerce website, implemented in Python and built on recent advances in deep learning. The platform lets users submit pictures instead of text, giving a friendlier experience and overcoming the limitations of traditional keyword-based search.
The methodology is a two-step process: offline database building and online search. In the offline stage, product images first undergo background removal with the U²-Net architecture to isolate the dominant product, and a pre-trained ResNet-50 model then extracts 2048-dimensional feature vectors. These descriptors are cached in an efficient database format for quick access. In the online search phase, query images pass through the same pipeline, followed by similarity matching with the FAISS (Facebook AI Similarity Search) library under cosine similarity.
I performed extensive experimental validation on a dataset of more than 1,000 product images from seven categories: watches, headphones, shirts, pants, helmets, cutlery and water bottles. The system performed well, achieving Top-1 retrieval accuracy of 89.76% on the training data, 82.49% on the validation data and 82.69% on test queries. Top-5 accuracy was above 97% on all three sets, indicating that the system is robust. Mean Average Precision (MAP) was consistently above 0.60, and Mean Reciprocal Rank (MRR) was above 0.88 on the validation and test sets.
Several findings stand out. Background removal effectively improves search precision by reducing background noise and emphasizing the key parts of the product. Transfer learning from the ImageNet-pretrained ResNet-50 network was efficient, removing the need to train on large domain-specific data. Category-based analysis showed that categories with clear visual characteristics (watches, water bottles) were recognized more accurately than those with high intra-class variation (clothing).
To the best of my knowledge, this work is the first to propose a full VSP for e-commerce website applications. The contributions of this work are as follows: (1) I present an end-to-end visual search system designed specifically for e-commerce applications; (2) I perform an extensive evaluation covering not only retrieval accuracy but also further metrics such as recall-vs-enumeration time curves and their counterparts for each product category; (3) I deliver a working implementation with a web-based frontend interface, as needed for practical real-world use; and (4) I analyze how different factors affect the performance of the VSP across two diverse types of product domains.
For future work, I can explore additional search modalities such as multimodal text-visual query combination, attention mechanisms for fine-grained feature extraction, cross-domain transfer learning and a mobile-efficient version for on-device processing. The system shows strong commercial potential and paves the way for future visual search applications in e-commerce. | en_US |
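
The online search step and two of the reported metrics can be illustrated with a minimal sketch. It is not the author's implementation: it swaps FAISS for a brute-force pure-Python search, uses 3-dimensional toy vectors in place of the 2048-dimensional ResNet-50 features, and every name in it (`l2_normalize`, `top_k`, `top_k_accuracy`, `mean_reciprocal_rank`, the sample data) is hypothetical. The idea it shows is that once vectors are scaled to unit length, ranking by dot product is the same as ranking by cosine similarity, and Top-k accuracy and MRR follow directly from the ranked category labels.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def top_k(query, index, k):
    """Return the labels of the k indexed vectors most similar to the query.

    Brute-force stand-in for a similarity-search library (assumption).
    """
    q = l2_normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, l2_normalize(vec))), label)
        for label, vec in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]

def top_k_accuracy(ranked_labels, true_labels, k):
    """Fraction of queries whose true label appears in the first k results."""
    hits = sum(
        1 for ranked, truth in zip(ranked_labels, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

def mean_reciprocal_rank(ranked_labels, true_labels):
    """Average of 1/rank of the first correct result (0 if none is found)."""
    total = 0.0
    for ranked, truth in zip(ranked_labels, true_labels):
        for rank, label in enumerate(ranked, start=1):
            if label == truth:
                total += 1.0 / rank
                break
    return total / len(true_labels)

# Hypothetical toy index: (category label, feature vector) pairs.
index = [
    ("watch", [1.0, 0.0, 0.0]),
    ("watch", [0.9, 0.1, 0.0]),
    ("shirt", [0.0, 1.0, 0.0]),
    ("bottle", [0.0, 0.0, 1.0]),
]

# Hypothetical queries: (feature vector, true category) pairs.
queries = [
    ([2.0, 0.1, 0.0], "watch"),
    ([0.6, 0.8, 0.0], "shirt"),
    ([0.5, 0.4, 0.3], "bottle"),
]

ranked = [top_k(vec, index, k=len(index)) for vec, _ in queries]
truths = [truth for _, truth in queries]

top1 = top_k_accuracy(ranked, truths, k=1)
mrr = mean_reciprocal_rank(ranked, truths)
print(f"Top-1 accuracy: {top1:.2f}, MRR: {mrr:.2f}")
```

The brute-force scan costs time linear in the index size per query, which is acceptable for a catalogue of roughly a thousand images; a dedicated library such as FAISS matters more as the catalogue grows.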