Platforms that let shoppers virtually try on cosmetics, apparel, and accessories have exploded in popularity over the past decade, and it’s easy to see why. According to a survey conducted by banking company Klarna, 29% of shoppers prefer to browse for items online before actually buying them, while 49% are interested in solutions that take their measurements so they can be sure something will fit before buying.
With this in mind, a team of researchers hailing from Adobe, the Indian Institute of Technology, and Stanford University developed what they describe as an “image-based virtual try-on” system for fashion. It’s called SieveNet, and it’s able to retain the characteristics of clothing (including wrinkles and folds) as garments are mapped onto virtual bodies, without introducing blurry or bleeding textures.
SieveNet’s objective is to generate a new image from two inputs — (1) a clothing image and (2) a body model image — such that in the final image, the target model is wearing the clothing while the original body shape, pose, and other details are preserved. To accomplish this, it incorporates a multi-stage technique that involves warping a garment to align with the body model’s pose and shape before transferring the warped texture onto said model.
This warping, the authors of a paper detailing their work note, requires accounting for variations in shape and pose between the clothing image and the model image, as well as occlusions in the model image (for example, long hair or crossed arms). Specialized modules within SieveNet predict coarse-level transformations and then fine-level corrections on top of them, while another module computes a rendered image and a mask indicating where on the body model the warped clothing should be copied.
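Under stated assumptions, the coarse-to-fine warp and masked composition described above can be sketched roughly as follows — the function names, the toy horizontal-shift “warp,” and the array shapes are illustrative stand-ins, not the paper’s actual learned modules:

```python
import numpy as np

def coarse_warp(garment, shift):
    """Placeholder for the learned coarse-level transform (the real system
    predicts a geometric warp); here, a simple horizontal pixel shift."""
    return np.roll(garment, shift, axis=1)

def fine_correction(coarse, residual):
    """Fine-level correction predicted on top of the coarse transformation."""
    return np.clip(coarse + residual, 0.0, 1.0)

def try_on(model_img, garment, shift, residual, mask):
    """Warp the garment, then copy it onto the model image wherever the
    predicted mask is 1 (a standard masked-composition step)."""
    warped = fine_correction(coarse_warp(garment, shift), residual)
    return mask * warped + (1.0 - mask) * model_img

# Toy usage on 4x4 single-channel "images"
model = np.zeros((4, 4))            # rendered body model (all black)
garment = np.full((4, 4), 0.5)      # flat gray garment texture
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                # garment region on the body
out = try_on(model, garment, shift=1, residual=np.zeros((4, 4)), mask=mask)
```

The key idea the sketch captures is the final composition: the warped garment overwrites the model image only inside the predicted mask, which is what prevents textures from bleeding into skin or background regions.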
In experiments using four NVIDIA 1080Ti graphics cards on a PC with 16GB of RAM, the researchers trained SieveNet on a data set consisting of around 19,000 images of front-facing female models and upper-clothing product images. They report that in qualitative tests, the system handled occlusion, pose variation, bleeding, geometric warping, and overall quality preservation better than baselines, and that it achieved state-of-the-art results on quantitative metrics including Fréchet Inception Distance (FID), which takes images from both the target distribution and the system being evaluated (in this case SieveNet) and uses an AI object recognition system to extract important features and measure how similar the two sets are.
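As a rough illustration of what FID measures: given feature vectors extracted from real and generated images (in practice these come from an Inception network; here they are assumed to be plain arrays), the metric compares the mean and covariance of the two feature distributions. A minimal NumPy sketch:

```python
import numpy as np

def sqrtm_psd(m):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

def fid(feats_real, feats_fake):
    """Frechet Inception Distance between two sets of feature vectors,
    each of shape (num_samples, num_features)."""
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    # Tr((C1 C2)^{1/2}) computed in symmetric form for numerical stability
    s1 = sqrtm_psd(c1)
    covmean = sqrtm_psd(s1 @ c2 @ s1)
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2.0 * covmean))
```

Identical feature sets score (numerically) zero, and the score grows as the generated distribution drifts from the real one — lower FID means the generated images are statistically closer to real photos.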
SieveNet isn’t the first of its kind, exactly. L’Oréal’s ModiFace, which recently came to Amazon’s mobile app, lets customers test out different shades of lipstick on live photos and videos of themselves. Startup Vue.ai’s AI system infers clothing characteristics and learns to produce realistic poses, skin colors, and other features, generating model images in every size up to five times faster than a traditional photo shoot. And both Gucci and Nike offer apps that allow people to virtually try on shoes.
But the researchers assert that a system like SieveNet could be more easily incorporated into existing apps and websites. “Virtual try-on — the visualization of fashion products in a personalized setting — is especially important for online fashion commerce because it compensates for the lack of a direct physical experience of in-store shopping,” they wrote. “We show significant … improvement[s] over the current state-of-the-art methods for image-based virtual try-on.”