Source: Finance Derivative
Author: Aleksandras Šulženko, Product Owner at Oxylabs
Pricing intelligence rests on two foundational principles: product and price matching. Extracting the latter is relatively easy. Prices are usually simple to locate in a page’s HTML, which makes web scraping well suited to collecting price data at scale.
Product matching is where the process becomes complicated. At first glance, it may not seem difficult at all: simply match the titles across websites, and you’re done. Unfortunately, such an approach would work for only a few percent of products across the entire ecommerce industry.
There’s no industry standard on how to create product titles. Additionally, on large user-generated marketplaces, SEO and other marketing considerations might come into play, making it even more challenging to find a perfect match.
Current solutions to product matching
As dynamic pricing is such a popular and important part of ecommerce, several solutions have emerged to tackle the problem. None of them, however, is foolproof on its own, so they are usually used in conjunction.
UPC, EAN, and GTIN comparisons are the most effective by far. They would be almost completely foolproof if not for the fact that few retailers ever publish them. Matching them is preferred to most other methods, but expectations are often shattered due to the scarce availability of such data.
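As a rough illustration of identifier matching, the sketch below validates scraped codes with the standard GS1 check-digit rule and pads them to 14 digits, so that a UPC and the equivalent EAN compare as equal. The function names are hypothetical and chosen for this example only.

```python
from typing import Optional

ASCII_DIGITS = set("0123456789")


def gtin_check_digit(body: str) -> int:
    """Compute the GS1 check digit for a code body (every digit except the last)."""
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost body digit.
    for position, char in enumerate(reversed(body)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def normalize_gtin(raw: str) -> Optional[str]:
    """Strip separators, validate the check digit, and pad to 14 digits.

    Returns None when the code has an unexpected length or a bad check digit.
    """
    digits = "".join(ch for ch in raw if ch in ASCII_DIGITS)
    if len(digits) not in (8, 12, 13, 14):
        return None
    if gtin_check_digit(digits[:-1]) != int(digits[-1]):
        return None
    return digits.zfill(14)


def same_product(code_a: str, code_b: str) -> bool:
    """Two listings match only if both codes are valid and normalize identically."""
    a, b = normalize_gtin(code_a), normalize_gtin(code_b)
    return a is not None and a == b
```

For example, `same_product("036000291452", "0 036000 291452")` treats a 12-digit UPC and its zero-padded 13-digit form as the same product, while a mistyped code fails validation instead of producing a false match.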
Scraping product specifications such as dimensions, models, and production dates is another option. These values are usually static across many retailers as they come from manufacturers and cannot be changed. Slight issues arise because the way specifications are structured and displayed differs across retailers. Additionally, some retailers might not list all of the same details.
Finally, there’s the possibility of producing logic trees. Descriptive features (e.g., phone) extracted from categories can be continually matched by other important aspects to create a logic tree (e.g., phone -> iPhone -> iPhone 12 -> iPhone 12 256GB, etc.).
Logic trees greatly reduce the likelihood of false positives but have the drawback of providing fairly few true positives. So, in the end, all methods are usually combined to maximize the probability of matching products.
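The logic tree idea can be sketched as a comparison of attribute paths. This is a minimal illustration and not a production matcher: it assumes each listing has already been parsed into an ordered path from general to specific, and the helper names are hypothetical.

```python
from typing import Sequence


def shared_depth(path_a: Sequence[str], path_b: Sequence[str]) -> int:
    """Count how many levels two attribute paths share, starting from the root."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a.strip().lower() != b.strip().lower():
            break
        depth += 1
    return depth


def is_match(path_a: Sequence[str], path_b: Sequence[str], required_depth: int) -> bool:
    """Accept a match only when the paths agree down to the required level."""
    return shared_depth(path_a, path_b) >= required_depth


listing_a = ("phone", "iPhone", "iPhone 12", "256GB")
listing_b = ("Phone", "iphone", "iPhone 12", "128GB")

# The two listings agree on three levels but differ on storage.
print(shared_depth(listing_a, listing_b))
print(is_match(listing_a, listing_b, required_depth=4))
```

Requiring agreement down to the deepest level keeps false positives low, but any listing with a missing or differently worded attribute is rejected. That is the trade-off described above.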
An understudied area of ecommerce analytics is object recognition. AI of this sort made the rounds online about a decade ago as it could separate cats from dogs, the internet’s favorite image source. Since then, significant strides have been made in the development of AI object recognition.
It could have its fair uses in product matching for ecommerce. Most retailers are heavily invested in high-quality images (or, in some cases, required to provide them) with clearly stated branding. A fair part of boxed products will have the product’s name listed on them with some potential for a description.
Machine learning models can derive fairly accurate descriptions of objects without any additional descriptors. Fine-tuned ones would be able to separate objects into categories, and ones dedicated to specific categories would be able to differentiate between objects within them.
Since most products, however, have descriptors added to the packaging or images, such as the aforementioned titles or other words, those can also be extracted. Marketing practices state that essential, differentiating information should be displayed most prominently, allowing a machine learning model to bypass other data collection methods.
There are some caveats, though. Most prominently, not all products can be differentiated purely by image. For example, iPhone versions could be detected, but it’s impossible to extract built-in storage capacity (e.g., 256 GB vs 512 GB) from the image. Therefore, in some cases, other sources will have to be used.
Additionally, some products may be extremely similar to one another (such as some of the IKEA range) and may be hard to tell apart even with a well-trained and adapted machine learning model.
There are some inherent benefits in ecommerce product object recognition. Retailers have the incentive to create crisp quality images that clearly showcase specific products as it improves conversion rates.
In many other applications, object recognition has to deal with images of highly variable quality, angles, and primary object visibility. In ecommerce, many of these issues will be less prevalent for the reasons outlined above.
Yet, there are still plenty of reasons to improve accuracy, as every percentage point will have a compounding effect in the long run. One option is to collect more data, which always works. However, it’s not the only way.
Data augmentation, the practice of tinkering with existing data to create new points, is perfectly suited for object recognition. Unlike text-based and numerical data, images can undergo a nearly infinite number of small changes while retaining their original meaning.
Common examples of data augmentation include object occlusion, photometric or geometric distortion (e.g., changing brightness or cropping), and superimposing two or more images on top of each other.
Object occlusion has shown promising results in making models more accurate at making predictions. The running theory is that by occluding certain parts of an object, the model is forced to focus on other parts to make a prediction, eliminating some possible skew.
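To make the augmentation idea concrete, here is a minimal sketch of random occlusion and a brightness change. It assumes a grayscale image stored as a list of rows of 0-255 integers. Real pipelines would normally use an image library, and the function names here are hypothetical.

```python
import random
from typing import List, Optional

Image = List[List[int]]


def occlude(image: Image, box_height: int, box_width: int,
            fill: int = 0, rng: Optional[random.Random] = None) -> Image:
    """Return a copy of the image with one randomly placed rectangle blanked out."""
    if rng is None:
        rng = random.Random()
    rows, cols = len(image), len(image[0])
    box_height = min(box_height, rows)
    box_width = min(box_width, cols)
    # Pick the top-left corner so the box always fits inside the image.
    top = rng.randrange(rows - box_height + 1)
    left = rng.randrange(cols - box_width + 1)
    occluded = [row[:] for row in image]  # leave the original untouched
    for r in range(top, top + box_height):
        for c in range(left, left + box_width):
            occluded[r][c] = fill
    return occluded


def adjust_brightness(image: Image, factor: float) -> Image:
    """Scale every pixel by a factor, clamping results to the 0-255 range."""
    return [[min(255, max(0, round(pixel * factor))) for pixel in row]
            for row in image]


original = [[200] * 4 for _ in range(4)]
variants = [
    occlude(original, box_height=2, box_width=2),
    adjust_brightness(original, 0.5),
]
```

Each variant keeps the label of the original image, so a single product photo can yield several training examples.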
Beyond object recognition practices themselves, the model can be integrated with existing product matching systems. Each prediction can be matched with an existing product in the database to see whether the specifications and other details also align.
For example, a prediction about a specific type of iPhone might turn out to be erroneous because the dimensions of that model don’t add up. In other words, there are some “hard facts” about products that never change. They can be used to hedge predictions to ensure that the system comes up with a higher level of accuracy.
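The hedging step described above can be sketched as a simple consistency check. Everything here is an assumption made for illustration: the catalog layout, the 2% tolerance, the placeholder label `phone-a`, and the made-up dimensions, which are not real product data.

```python
from typing import Dict

Specs = Dict[str, float]


def confirm_prediction(label: str, scraped: Specs,
                       catalog: Dict[str, Specs], rel_tol: float = 0.02) -> bool:
    """Check a predicted label against known, unchanging specifications.

    Returns False when the label is unknown or any scraped value contradicts
    the reference. Specs the retailer did not list are skipped because they
    cannot contradict anything.
    """
    reference = catalog.get(label)
    if reference is None:
        return False
    for key, expected in reference.items():
        observed = scraped.get(key)
        if observed is None:
            continue
        if abs(observed - expected) > rel_tol * abs(expected):
            return False
    return True


# Hypothetical reference data for a placeholder product.
catalog = {"phone-a": {"height_mm": 150.0, "width_mm": 70.0}}

print(confirm_prediction("phone-a", {"height_mm": 150.5}, catalog))
print(confirm_prediction("phone-a", {"height_mm": 160.0, "width_mm": 70.0}, catalog))
```

A rejected prediction does not have to be discarded. It could instead be routed to another matching method or to manual review.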
So, it may seem like simply another method that could be viable for detecting and matching products. Yet, there is an important connection between machine learning and web scraping.
Web scraping builds models
One of the hardest parts of building a machine learning model, if not the hardest, is getting all the data it needs. Typically, you’d have to scrape thousands of pages, label data, and keep feeding it into the algorithm.
But wherever pricing intelligence is already in place, the data is readily available. All the other methods of product matching rely on procuring the data that could be easily used to build a machine learning model.
As such, the resource costs associated with creating one are minimized. Some sort of labeling would still be involved, but even that could be automated. After all, the products are already matched, so the desired output is known.
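One way the automated labeling could look is sketched below. It assumes the existing matching system already stores, for each scraped listing, an image URL and the ID of the product it was matched to. The field names `image_url` and `matched_product_id` are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def build_training_pairs(
    matched_listings: Iterable[Dict[str, Optional[str]]]
) -> List[Tuple[str, str]]:
    """Turn already-matched listings into (image_url, label) training pairs.

    Listings without an image or without a confirmed match are skipped, and
    each image URL is used only once.
    """
    pairs: List[Tuple[str, str]] = []
    seen = set()
    for listing in matched_listings:
        url = listing.get("image_url")
        product_id = listing.get("matched_product_id")
        if not url or not product_id or url in seen:
            continue
        seen.add(url)
        pairs.append((url, product_id))
    return pairs


listings = [
    {"image_url": "https://example.com/a.jpg", "matched_product_id": "sku-1"},
    {"image_url": "https://example.com/b.jpg", "matched_product_id": None},
    {"image_url": "https://example.com/a.jpg", "matched_product_id": "sku-1"},
]
print(build_training_pairs(listings))
```

The existing matches act as the labels, so manual annotation is only needed for listings that were never matched.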
Since web scraping almost always downloads the entire HTML and parses it through to deliver the necessary data, downloading images to feed into the algorithm isn’t much of a change to the regular course of action. One word of caution, however, is that image delivery would greatly increase traffic costs for proxies, which can affect overall operation costs.
Therefore, the supplementary model is almost already available. Most of the hard work required to create one is done by the requirements of pricing intelligence. As such, gathering a dataset for implementing object recognition for price and product matching is much simpler than it may seem at first glance.
Product matching is likely one of the most complicated tasks allotted to ecommerce analytics. While rarely used, object recognition is one way to increase the likelihood of true positive detection.
One question remains – how much should one trust the model’s output? Unfortunately, I believe there’s no easy answer, as accuracy depends on so many factors that giving a blanket one would be meaningless.
How will regulations affect the open banking sector?
Source: Finance Derivative
Martin Hartley – Group CCO of emagine Consulting
Comments on the future of the open banking sector and how it will affect the UK market.
“The UK Open Banking Sector is still primarily driven by regulation. In my view, two of the major current regulations will remain at the forefront moving forward, namely the CMA (Competition and Markets Authority), which mandated the major banks to provide open banking access to authorised third-party providers, and PSD2 (Second Payment Services Directive), which set the standards for secure data sharing. Cybersecurity regulations will only increase in importance, as will Brexit-related changes as any divergence between UK and EU standards could impact open banking.
“Over the upcoming months, increased data sharing through open banking will add crucial pressures to cybersecurity, likely creating a surge in the sector once again.
“I expect ongoing scrutiny and efforts to enhance data protection measures, potentially leading to more stringent cybersecurity regulations being adopted by businesses. I expect to see more partnerships between traditional banks and FinTechs or consultancy firms as they collaborate to enhance cybersecurity or offer innovative services to plug the gap. Conversely, there could be consolidation within the FinTech industry as companies merge to gain market share.
“When it comes to the size of the business and how it is affected, history has shown us that there are certainly positives and negatives of being an SMB when responding to new regulations. On the positive side, they can leverage their agility and they will have a more personal relationship with their customers, potentially leading to a higher level of trust. However, SMBs may face challenges due to their limited budgets and resources. The larger firms will have much larger budgets, allowing them to have more advanced IT systems and IT security, making it easier for them to integrate APIs and develop the necessary infrastructure.
“The benefits of open banking are endless, and the UK Government is showing their forward-thinking mentality in exploring the idea of implementing the technology to streamline wider services. But, much like anything, there are always pros and cons.
“Open banking would simplify payments for public services, making transactions quicker and more convenient for everyone. As it relies on APIs and authentication protocols, open banking would make payments more secure for the public and it would allow access to digital payments for members of the public who have smartphones but possibly no bank accounts. For any digital implementation, it goes without saying that we need to be aware of the risk of cyber attacks and data breaches. These, combined with the exclusion of non-tech savvy individuals, could mean that certain members of the public may not embrace the change, which poses a risk. There is also the additional cost of providing the infrastructure and this will have to be managed carefully to avoid burdening the taxpayer.
“We have already seen digital transformations in areas such as the GOV.UK Pay System and there are two main indicators of the success of any digital implementation: adoption rates and incidents. There haven’t been any high profile incidents that have hit the headlines in recent times so that to me is a huge positive and provides a level of confidence. It would be interesting to see how many government departments and agencies have adopted GOV.UK Pay for their payment processing needs to understand the system’s usefulness and acceptance within the government. The government must be committed to continuous improvement, to ensuring that the system continues to comply with regulations, and to consciously driving the adoption rate to hit at least 90% of government departments and agencies.
“A favourable regulatory environment will encourage more banks and third-party providers to participate in open banking initiatives, leading to growth in the UK market and positioning the nation as industry leaders.”
Advancing green mobility for a sustainable future
Accelerating decarbonisation, the transition to SDVs, and the reshaping of urban ecosystems are helping revolutionise the global automotive industry
By Amit Chadha, CEO & Managing Director, L&T Technology Services
The world is changing. There is an urgent need for a transition toward sustainable practices to combat the threat of climate change. As global temperatures rise and weather patterns evolve, achieving net-zero emissions by 2050 could still help prevent irreversible damage to our planet.
With global carbon emission levels continuing to rise at an accelerated rate, there is growing momentum toward addressing the scenario on a war footing. As the most visible source of emissions, the automotive industry, and, consequently, the future of mobility, is in focus. By helping accelerate decarbonisation, reshape evolving urban ecosystems, and redefine the global automotive industry – we can help reverse the trend and preserve our shared future.
Green mobility has emerged as a major enabler in this direction. Leading stakeholders are becoming increasingly invested in developing a deeper understanding of the multifaceted realm of green mobility and its potential to shape a sustainable future.
Accelerating decarbonisation: A global mandate
Decarbonising the transportation sector is crucial to mitigate the harmful effects of climate change. Fossil fuel-based vehicles are responsible for a substantial portion of carbon dioxide emissions, exacerbating the greenhouse effect. To accelerate decarbonisation, governments and businesses today need to prioritise the adoption of clean, renewable energy sources, such as electricity and hydrogen, for powering vehicles and other modes of public transportation.
Automakers, recovering from the impact of the pandemic and global supply chain disruptions, are therefore exploring new avenues to meet the rising demand for electric mobility. Electric vehicles (EVs), by eliminating the need for fossil fuel-powered engines, play a vital role in improving overall air quality and have emerged as a promising solution for reducing carbon emission levels. They are capable of meeting the diverse needs of all kinds of drivers and offer affordable mobility and maintenance options. Recent advancements in battery technology, along with the growing availability of charging infrastructure and incentives for adoption, have led to a significant rise in the popularity of EVs.
However, to achieve widespread adoption of electric vehicles, there is a need to address key issues such as battery disposal, supply chain sustainability, and equitable access to EV technology.
Reshaping urban ecosystems: Driving the frontiers of change
Urban areas are central to the momentum around green mobility transformation. As growing global populations gravitate towards cities – congestion, pollution, and limited availability of green spaces have emerged as major challenges. As a result, cities must increasingly reinvent themselves to promote sustainable mobility and improve the quality of life for their residents.
Smart technologies and vertical green systems can contribute to a reduction in the energy demands of buildings by providing shade and insulation, mitigating urban heat islands, and cooling down public spaces. They also enable carbon sequestration, a reduction in pollution levels, and improvements in biodiversity.
Implementing efficient transportation systems, such as buses and trains powered by clean energy, can further reduce individual vehicle usage, traffic congestion, and emissions. Pedestrian-friendly infrastructures, cycling lanes, and micro-mobility solutions like e-scooters and bike-sharing programs can further help promote eco-friendly transportation choices. At a macro-infra level, smart city technologies and data-driven urban planning practices are helping optimise traffic flow, reduce idling times, and minimise fuel consumption.
Integrating green mobility into urban ecosystems is therefore a win-win proposition – fostering cleaner air, enhanced mobility options, and healthier communities.
From a public health perspective, improved air quality can drive a decline in respiratory and cardiovascular diseases linked to air pollution. Healthier citizens translate to a more productive workforce and reduced healthcare costs, further strengthening the growing impetus for vehicle electrification. The shift towards vehicle electrification offers significant economic benefits, including greater job creation, enhanced research and development, and greater investments in sustainable innovations. A consequent reduction in the demand for fossil fuels, scarce in terms of availability and mostly imported, in turn, helps enhance energy security and stabilise fuel prices.
Software Defined Vehicles: Pioneering the change
The global automotive industry is at the core of driving the emerging frontiers of green mobility. Traditional automakers and new entrants are racing to produce eco-friendly vehicles, and this competitive spirit, in turn, is transforming the industry landscape.
Automakers worldwide need to embrace sustainable practices by reducing their carbon footprint during the production process and implementing circular economy principles. Moreover, investing in research and development of alternative materials and manufacturing processes can lead to lighter, more energy-efficient vehicles. The rise of autonomous vehicles presents an opportunity to optimise transportation networks, enhance traffic flow, and reduce accidents. Leveraging this technology, in combination with electric and shared mobility solutions, can lead to a more sustainable and efficient future for transportation.
Software will play a key role in this direction, delivering a streamlined passenger and driver experience while ensuring conformity with evolving regulatory standards. With Software Defined Vehicles (SDVs) increasingly becoming a focus area for major automakers worldwide, the future will see greater demand for digital engineering services to unlock new value streams.
The importance of ecosystem partnerships
Automotive industry stakeholders are already working with engineering research and development (ER&D) partners who can deliver across the value chain and understand each of the key parameters in the EV/SDV ecosystem. However, approaching separate vendors for product conceptualisation, design and development, testing, maintenance, manufacturing and after-sales support can increase costs and complexities.
An ER&D partner, equipped with multi-industry expertise, digital engineering capabilities, and a co-innovation commitment, can help drive transformation initiatives for transportation enterprises, overcoming technology constraints with cross-vertical learnings. Leveraging global delivery capabilities, the partner can also provide computing models that consume less energy, boost performance, and optimise data-led algorithms. In addition, they can enable scalable software stacks that leverage sensors and physical components to provide the safety and performance that electric vehicles need.
ER&D companies are also increasingly being called upon to help redefine focus areas with software, ensuring third-party integration, driving feature deployment, enabling CloudOps and fast over-the-air updates. The rising complexities within the connected car landscape further call for adopting software-defined designs that can overcome multi-layered challenges – ranging from development to subsequent deployment, maintenance, and updates.
A multi-stakeholder approach
Achieving the goal of green mobility demands collaboration among various stakeholders. Governments play a crucial role in enacting policies and regulations that incentivise the adoption of sustainable practices and technologies. Subsidies for EVs, emission standards, and urban planning regulations are some of the ways governments can drive the transition towards greener mobility.
Private sector involvement is equally critical. Corporate sustainability initiatives, investment in research and development, and partnerships for innovative mobility solutions can accelerate the transformation. Additionally, consumer awareness and support for eco-friendly practices are essential in shaping market demands and influencing business decisions.
Advancing green mobility is a pivotal step towards a sustainable future. By accelerating decarbonisation, embracing the transition to SDVs, reshaping urban ecosystems, and revolutionising the automotive industry, we can combat climate change on a significant battleground. The collective efforts of governments, industries, and individuals are crucial in driving this transformation.
Embracing green mobility is therefore not just about reducing emissions, but rather, about fostering a healthier, cleaner, and more resilient world. It is about our common future – striving together toward a prosperous, inclusive, and sustainable tomorrow.
How Turning Your Core Data into a Product Drives Business Impact
By Venki Subramanian, SVP of Product Management at Reltio
Data drives efficiencies, improves customer experience, enables companies to identify and manage risks, and helps everyone from human resources to sales make informed decisions. It is the lifeblood of most organisations today. Sometime during the last few years, however, organisations turned a corner from embracing data to fearing it as the volume spiralled out of control. By 2025, for example, it is estimated that the world will produce 463 exabytes of data daily compared to 3 exabytes a decade ago.
Too much enterprise data is locked up, inaccessible, and tucked away inside monolithic, centralised data lakes, lake houses, and warehouses. Since almost every aspect of a business relies on data to make decisions, accessing high-quality data promptly and consistently is crucial for success. But finding it and putting it to use is often easier said than done.
That’s why many organisations are turning to “distributed data” and creating “data products” to solve these challenges, especially for core data, which is any business’s most valuable data asset. Core data or master data refers to the foundational datasets that are used by most business processes and fall into four major categories – organisations, people (individuals), locations, and products. A data product is a reusable dataset used by analysts or business users for specific needs. Most organisations are undergoing massive digital and cloud transformations. Putting high-quality core data at the centre of these transformations, and treating it as a product, can yield a significant return on investment.
Customer data is one example of core or master data that firms rely on to generate outstanding customer experiences and accelerate growth by providing better products and services to consumers. However, leveraging core customer data becomes extremely challenging without timely, efficient access. The data is often trapped inside monolithic, centralised data storage systems. This can result in incomplete, inaccurate, or duplicative information. Once hailed as the saviour to the data storage and management challenge, monolithic systems escalate these problems as the volume of data expands and the urgent need for making data-driven decisions rises.
The traditional approaches for addressing data challenges entail extracting the data from the systems of record and moving it to different data platforms, such as operational data stores, data lakes, or data warehouses, before generating use case-specific views or data sets. Because each of these data sets is then consumed by its own use case-specific technology, the overall inefficiency of the process increases.
One inefficiency arises from the complexity of such a landscape, which involves the movement of data from many sources to various data platforms, the creation of use case-specific data sets, and the use of multiple technologies for consumption. Core data for each domain, such as customer, is duplicated and reworked or repackaged for almost every use case instead of producing a consistent representation of the data used across various use cases and consumption models – analytical, operational, and real-time.
There’s also a disconnect between data ownership and the subject matter experts that need it for decision-making. Data stewards and scientists understand how to access data, move it around and create models. But they’re often unfamiliar with the specific use cases in the business. In other words, they’re experts in data modelling, not finance, human resources, sales, product management, or marketing. They’re not domain experts and may not understand the information needed for specific use cases, leading to frustration and data going unused. It’s estimated, for example, that 20% or fewer of data models created by data scientists are deployed.
Distributed Data Architecture – An Elegant Solution to a Messy Problem
The broken promises of monolithic, centralised data storage have led to the emergence of a new approach called “distributed” data architectures, such as data fabric and data mesh. A data mesh can create a pipeline of domain-specific data sets, including core data, and deliver it promptly from its source to consuming systems, subject matter experts, and end users.
These data architectures have arisen as a viable solution for the issues created by inaccessible data locked away in siloed systems or rigid monolithic data architectures of the past. Data mesh decentralises the management and governance of data sets. It follows four core principles – domain ownership of data, treating data as a product and applying product principles to data, enabling a self-serve data infrastructure, and ensuring federated governance. These help data product owners create data products based on the needs of various data consumers, and help data consumers learn what data products are available and how to access and use them. Data quality, observability, and self-service capabilities for discovering data and metadata are built into these data products.
The rise of the concept of data products is helpful both for analytics and artificial intelligence and for general business uses. The concept is the same in either case – the dataset can be reused without a major investment in time or resources. It can dramatically reduce the amount of time spent finding and fixing data. Data products can also be updated regularly, keeping them fresh and relevant. Some legacy companies have reported increased revenues or cost savings of over $100 million.
Data product owners have to create data products for core data to enable its activation for key initiatives and support various consumption models in a self-serve manner. The typical pattern that all these data pipelines enable can be summarised into the following three stages – collect, unify, and activate.
The process starts with identifying the core data sets – data domains like customer or product – and defining a unified data model for these. Then, data product owners need to identify the first-party data sources and the critical third-party data sets used to enrich the data. This data is assembled, unified, enriched, and provided to various consumers via APIs so that the data can be activated for various initiatives. Product principles such as the ability to consume these data products in a self-service manner, customise the base product for various usage scenarios, and deliver regular enhancements to the data are built into such data products.
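The collect, unify, and activate stages can be illustrated with a deliberately simplified sketch. It assumes customer records arrive as dictionaries from named sources and can be unified on a normalised email address. Real master data management relies on far richer matching and survivorship rules, and all names below are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, str]


def collect(sources: Dict[str, List[Record]]) -> List[Record]:
    """Gather records from every source, tagging each with where it came from."""
    return [dict(record, _source=name)
            for name, records in sources.items()
            for record in records]


def unify(records: List[Record], key: str = "email") -> Dict[str, dict]:
    """Merge records that share the same normalised key into one profile."""
    unified: Dict[str, dict] = {}
    for record in records:
        identity = record[key].strip().lower()
        profile = unified.setdefault(identity, {key: identity, "sources": []})
        for field, value in record.items():
            if field in (key, "_source") or value in (None, ""):
                continue
            profile.setdefault(field, value)  # first non-empty value wins
        profile["sources"].append(record["_source"])
    return unified


def activate(unified: Dict[str, dict], identity: str) -> Optional[dict]:
    """Serve a unified profile to a consumer, as an API endpoint might."""
    return unified.get(identity.strip().lower())


sources = {
    "crm": [{"email": "Ann@Example.com", "name": "Ann Lee", "phone": ""}],
    "billing": [{"email": "ann@example.com", "phone": "555-0100"}],
}
profiles = unify(collect(sources))
print(activate(profiles, "ANN@example.com"))
```

The first non-empty value wins in this sketch. A real data product would instead apply explicit rules about which source to trust for each attribute.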
Data product owners can use this framework to map out key company initiatives, identify the most critical data domains, and identify the features (data attributes, relationships, etc.) and the first- and third-party data sources that need to be assembled, in order to create a roadmap of data products and align them to business impact and value delivered.
With data coming from potentially hundreds of applications and the constantly evolving requirements of data consumers, poor quality data and slow and rigid architecture can cost companies in many ways, from lost business opportunities to regulatory fines to reputational risk from poor customer experience. That’s why organisations of all sizes and types need a modern, cloud-based master data management approach that can enable the creation of core data as products. A cloud-based MDM can reconcile data from hundreds of first and third-party sources and create a single trusted source of truth for an entire organisation. Treating core data as a product can help businesses drive value by treating it as a strategic asset and unlocking its immense potential to drive business impact.