Object Recognition for Price Matching
Source: Finance Derivative
Author: Aleksandras Šulženko, Product Owner at Oxylabs
Pricing intelligence rests on two foundational principles: product and price matching. Extracting price data is relatively easy. It is usually easy to find in the HTML file, which makes web scraping well suited to collecting price data at scale.
Product matching is where the process becomes complicated. At first glance, it may not seem difficult at all. Simply match the titles across websites, and you’re done. Unfortunately, such an approach would work for only a few percent of products across the entire ecommerce industry.
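As a minimal sketch of why price extraction tends to be the easy half, the snippet below pulls a price out of a small HTML fragment using only the Python standard library. The `price` class name, the sample markup, and the dot-decimal price format are assumptions made for illustration; real pages vary and usually need per-site rules.

```python
import re
from decimal import Decimal
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the first number found after each element with class 'price'."""

    def __init__(self):
        super().__init__()
        self._capture = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # Assumption: the retailer marks prices with a "price" CSS class.
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._capture = True

    def handle_data(self, data):
        if not self._capture:
            return
        # Assumption: dot as decimal separator, no thousands separators.
        match = re.search(r"\d+(?:\.\d{1,2})?", data)
        if match:
            self.prices.append(Decimal(match.group()))
            self._capture = False


def extract_prices(html):
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


SAMPLE_HTML = (
    '<div class="product"><h1>Example Phone 128GB</h1>'
    '<span class="price">$499.99</span></div>'
)
print(extract_prices(SAMPLE_HTML))
```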
There’s no industry standard on how to create product titles. Additionally, on large user-generated marketplaces, SEO and other marketing considerations might come into play, making it even more challenging to find a perfect match.
Current solutions to product matching
As dynamic pricing is such a popular and important part of ecommerce, several solutions have emerged to tackle the problem. None of them, however, provides a foolproof detection method, so they are usually used in conjunction.
UPC, EAN, and GTIN comparisons are the most effective by far. They would be almost completely foolproof if not for the fact that few retailers ever publish them. Matching them is preferred to most other methods, but expectations are often shattered due to the scarce availability of such data.
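Where identifiers are published, comparing them can be sketched roughly as follows. The check-digit rule is the standard GTIN one (alternating weights of 3 and 1, counted from the right), and zero-padding to 14 digits lets a UPC and an EAN for the same item compare equal. The function names are illustrative and not taken from any particular library.

```python
def is_valid_gtin(code: str) -> bool:
    """Validate the check digit of a GTIN-8, UPC-A (12), EAN-13, or GTIN-14."""
    digits = code.strip()
    if not (digits.isascii() and digits.isdigit()):
        return False
    if len(digits) not in (8, 12, 13, 14):
        return False
    body, check = digits[:-1], int(digits[-1])
    # Weights alternate 3, 1, 3, 1... starting from the digit next to the check digit.
    total = sum(
        int(d) * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body))
    )
    return (10 - total % 10) % 10 == check


def to_gtin14(code: str) -> str:
    """Left-pad with zeros so shorter formats share one canonical form."""
    return code.strip().zfill(14)


def same_product(code_a: str, code_b: str) -> bool:
    """Two listings match if both codes are valid and equal once normalized."""
    return (
        is_valid_gtin(code_a)
        and is_valid_gtin(code_b)
        and to_gtin14(code_a) == to_gtin14(code_b)
    )
```

Validating before comparing guards against typos in scraped codes, which would otherwise cause silent mismatches.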
Scraping product specifications such as dimensions, models, and production dates is another option. These values are usually static across retailers, as they come from manufacturers and cannot be changed. Slight issues arise because the way specifications are displayed isn’t consistent across retailers. Additionally, some retailers might not list all of the same details.
Finally, there’s the possibility of producing logic trees. Descriptive features (e.g., phone) extracted from categories can be continually matched by other important aspects to create a logic tree (e.g., phone -> iPhone -> iPhone 12 -> iPhone 12 256GB, etc.).
Logic trees greatly reduce the likelihood of false positives but have the drawback of providing fairly few true positives. So, in the end, all methods are usually combined to maximize the probability of matching products.
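One way to picture a logic tree is as an ordered list of attributes that two listings must share, from general to specific. The sketch below is an assumption about how such a tree could be represented; the level names and sample listings are hypothetical.

```python
# Hypothetical levels for a "phone" tree, ordered from general to specific.
LEVELS = ("category", "family", "model", "storage")


def match_depth(a: dict, b: dict) -> int:
    """Count how many consecutive levels two listings share, starting at the root."""
    depth = 0
    for level in LEVELS:
        if a.get(level) is None or a.get(level) != b.get(level):
            break
        depth += 1
    return depth


def is_same_product(a: dict, b: dict) -> bool:
    """Only a full-depth match counts as the same product."""
    return match_depth(a, b) == len(LEVELS)


listing_a = {"category": "phone", "family": "iPhone", "model": "iPhone 12", "storage": "256GB"}
listing_b = {"category": "phone", "family": "iPhone", "model": "iPhone 12", "storage": "128GB"}
listing_c = {"category": "phone", "family": "iPhone", "model": "iPhone 12"}
```

Requiring a full-depth match keeps false positives low, but a listing with a missing attribute (such as `listing_c`) is never matched, which reflects the limited number of true positives described above.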
An understudied area of ecommerce analytics is object recognition. AI of this sort made the rounds online about a decade ago as it could separate cats from dogs, the internet’s favorite image source. Since then, significant strides have been made in the development of AI object recognition.
It could have its fair uses in product matching for ecommerce. Most retailers are heavily invested in high-quality images (or, in some cases, required to provide them) with clearly stated branding. A fair part of boxed products will have the product’s name listed on them with some potential for a description.
Machine learning models can derive fairly accurate descriptions of objects without any additional descriptors. Fine-tuned ones would be able to separate objects into categories, and ones dedicated to specific categories would be able to differentiate between objects within them.
Since most products, however, have descriptors added to the packaging or images, such as the aforementioned titles or other words, those can also be extracted. Marketing practices state that essential, differentiating information should be displayed most prominently, allowing a machine learning model to bypass other data collection methods.
There are some caveats, however. Most prominently, not all products can be differentiated purely by image. For example, iPhone versions could be detected, but it’s impossible to extract built-in storage capacity (i.e., 256 GB vs 512 GB) from the image. Therefore, in some cases, other sources will have to be used.
Additionally, some products may be extremely similar between themselves (such as some of the IKEA range), which, even with a well-trained and adapted machine learning model, may be hard to detect outright.
There are some inherent benefits in ecommerce product object recognition. Retailers have the incentive to create crisp quality images that clearly showcase specific products as it improves conversion rates.
In many cases, recognition deals with images of highly variable quality, angles, and primary object visibility. In ecommerce, many of these issues will be less prevalent due to the reasons outlined above.
Yet, there are still plenty of reasons to improve accuracy, as every percentage point will have a compounding effect in the long run. One option is to collect more data, which always works. However, it’s not the only way.
Data augmentation, the practice of tinkering with existing data to create new points, is perfectly suited for object recognition. Unlike text-based and numerical data, images can have nearly an infinite number of small changes while retaining the original intention.
Common examples of data augmentation include object occlusion, using photometric or geometric distortion (i.e., changing brightness, cropping, etc.), and superimposing two or more images on top of each other.
Object occlusion has shown promising results in making models more accurate at making predictions. The running theory is that by occluding certain parts of an object, the model is forced to focus on other parts to make a prediction, eliminating some possible skew.
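Two of the augmentations mentioned above, occlusion and a photometric change, can be sketched in a few lines. To stay dependency-free, the example treats a grayscale image as a nested list of 0 to 255 values; that representation and the function names are assumptions, and a production pipeline would more likely rely on an image or deep learning library.

```python
import random


def occlude(image, box_h, box_w, rng, fill=0):
    """Return a copy of the image with one random box_h x box_w patch set to fill."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - box_h + 1)
    left = rng.randrange(width - box_w + 1)
    out = [row[:] for row in image]  # copy so the original stays untouched
    for y in range(top, top + box_h):
        for x in range(left, left + box_w):
            out[y][x] = fill
    return out


def adjust_brightness(image, factor):
    """Scale every pixel by factor, clamped to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


image = [[100] * 8 for _ in range(8)]           # flat gray 8x8 test image
occluded = occlude(image, 3, 3, random.Random(0))
brighter = adjust_brightness(image, 1.5)
```

Each call produces a new training sample with the same label as the source image, which is what makes augmentation cheap compared with collecting more data.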
Outside of object recognition practices, the model can be integrated with existing product matching systems. Each prediction can be matched in with an existing product in the database to see whether the specifications and other details also align.
For example, a prediction about a specific type of iPhone might turn out to be erroneous because the dimensions of that model don’t add up. In other words, there are some “hard facts” about products that never change. They can be used to hedge predictions to ensure that the system comes up with a higher level of accuracy.
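A rough sketch of such a cross-check is shown below: a predicted product is accepted only if the scraped specifications agree with reference data. The catalog entries, field names, and tolerance are hypothetical values chosen for illustration, not real product dimensions.

```python
# Hypothetical reference data; the figures are illustrative only.
CATALOG = {
    "phone-a": {"height_mm": 146.7, "width_mm": 71.5},
    "phone-b": {"height_mm": 160.8, "width_mm": 78.1},
}


def consistent_with_catalog(predicted_id, scraped_specs, tolerance_mm=1.0):
    """Return True/False if the prediction can be checked, or None if it cannot."""
    reference = CATALOG.get(predicted_id)
    if reference is None:
        return None
    shared = [key for key in reference if key in scraped_specs]
    if not shared:
        return None  # nothing to compare, so the prediction stays unverified
    return all(
        abs(scraped_specs[key] - reference[key]) <= tolerance_mm for key in shared
    )
```

Returning `None` for unverifiable cases keeps "no evidence" distinct from "contradicted", so a downstream system can decide separately how to treat each.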
So, it may seem like simply another method that could be viable at detecting and matching products. Yet, there is something important in relation to machine learning and web scraping.
Web scraping builds models
One of the hardest parts, if not truly the most complicated, is getting all the data that’s needed for a machine learning model. Typically, you’d have to scrape thousands of pages, label data, and keep feeding it into the algorithm.
But wherever pricing intelligence is already in place, the data is readily available. All the other methods of product matching rely on procuring the data that could be easily used to build a machine learning model.
As such, the resource costs associated with creating one are minimized. There would still have to be some sort of labeling involved. However, even that could be automated. After all, the products are already matched, so the desired output is known.
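Automated labeling of this kind could look roughly like the sketch below, which turns already-matched listings into (image, label) pairs. The record fields `matched_product_id` and `image_urls` are assumed names for whatever an existing pricing intelligence pipeline stores.

```python
def build_training_pairs(matched_listings):
    """Turn already-matched listings into (image_url, product_id) training pairs."""
    pairs = []
    for listing in matched_listings:
        product_id = listing.get("matched_product_id")
        if not product_id:
            continue  # unmatched listings carry no known label
        for url in listing.get("image_urls", []):
            pairs.append((url, product_id))
    return pairs


listings = [
    {"matched_product_id": "sku-1", "image_urls": ["https://example.com/a.jpg", "https://example.com/b.jpg"]},
    {"matched_product_id": None, "image_urls": ["https://example.com/c.jpg"]},
]
```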
Since web scraping almost always downloads the entire HTML and parses it through to deliver the necessary data, downloading images to feed into the algorithm isn’t much of a change to the regular course of action. One word of caution, however, is that image delivery would greatly increase traffic costs for proxies, which can affect overall operation costs.
Therefore, the supplementary model is almost already available. Most of the hard work required to create one is done by the requirements of pricing intelligence. As such, gathering a dataset for implementing object recognition for price and product matching is much simpler than it may seem at first glance.
Product matching is likely one of the most complicated tasks allotted to ecommerce analytics. While rarely used, object recognition is one way to increase the likelihood of true positive detection.
One question remains – how much should one trust the model’s output? Unfortunately, I believe there’s no easy decision as accuracy is dependent on so many factors that giving a blanket answer is meaningless.
Enhancing cybersecurity in investment firms as new regulations come into force
Source: Finance Derivative
Christian Scott, COO/CISO at Gotham Security, an Abacus Group Company
The alternative investment industry is a prime target for cyber breaches. February’s ransomware attack on global financial software firm ION Group was a warning to the wider sector. Russia-linked LockBit Ransomware-as-a-Service (RaaS) affiliate hackers disrupted trading activities in international markets, with firms forced to fall back on expensive, inefficient, and potentially non-compliant manual reporting methods. Not only do attacks like these put critical business operations under threat, but firms also risk falling foul of regulations if they lack a sufficient incident response plan.
To ensure that firms protect client assets and keep pace with evolving challenges, the Securities and Exchange Commission (SEC) has proposed new cybersecurity requirements for registered advisors and funds. Codifying previous guidance into non-negotiable rules, these requirements will cover every aspect of the security lifecycle and the specific processes a firm implements, encompassing written policies and procedures, transparent governance records, and the timely disclosure of all material cybersecurity incidents to regulators and investors. Failure to comply with the rules could carry significant financial, legal, and national security implications.
The proposed SEC rules are expected to come into force in the coming months, following a notice and comment period. However, businesses should not drag their feet in making the necessary adjustments – the SEC has also introduced an extensive lookback period preceding the implementation of the rules, meaning that organisations should already be proving they are meeting these heightened demands.
For investment firms, regulatory developments such as these will help boost cyber resilience and client confidence in the safety of investments. However, with a clear expectation that firms should be well aligned to the requirements already, many will need to proactively step up their security oversight and strengthen their technologies, policies, end-user education, and incident response procedures. So, how can organisations prepare for enforcement and maintain compliance in a shifting regulatory landscape?
In today’s complex, fast-changing, and interconnected business environment, the alternative investment sector must continually take account of its evolving risk profile. Additionally, as more and more organisations shift towards more distributed and flexible ways of working, traditional protection perimeters are dissolving, rendering firms more vulnerable to cyber-attack.
As such, the new SEC rules provide firms with additional instruction in the form of very specific, prescriptive requirements. Organisations need to implement and maintain robust written policies and procedures that closely align with ground-level security issues and industry best practices, such as the NIST Cybersecurity Framework. Firms must also be ready to gather and present evidence that proves they are following these watertight policies and procedures on a day-to-day basis. With much less room for ambiguity or assumption, the SEC will scrutinise security policies for detail on how a firm is dealing with cyber risks. Documentation must therefore include comprehensive coverage for business continuity planning and incident response.
As cyber risk management comes increasingly under the spotlight, firms need to ensure it is fully incorporated as a ‘business as usual’ process. This involves the continual tracking and categorisation of evolving vulnerabilities – not just from a technology perspective, but also from an administrative and physical standpoint. Regular risk assessments must include real-time threat and vulnerability management to detect, mitigate, and remediate cybersecurity risks.
Another crucial aspect of the new rules is the need to report any ‘material’ cybersecurity incidents to investors and regulators within a 48-hour timeframe – a small window for busy investment firms. Meeting this tight deadline will require firms to quickly pull data from many different sources, as the SEC will demand to know what happened, how the incident was addressed, and its specific impacts. Teams will need to be assembled well in advance, working together seamlessly to record, process, summarise, and report key information in a squeezed timeframe.
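To make the time pressure concrete, an incident response runbook could track the window with a small helper like the one below. This is only a sketch: the 48-hour figure comes from the text above, while the point at which the clock starts is an assumption here and would need to be confirmed against the final rule.

```python
from datetime import datetime, timedelta, timezone

REPORTING_WINDOW = timedelta(hours=48)  # window cited in the text above


def reporting_deadline(clock_start: datetime) -> datetime:
    """Latest time a report can be filed, given when the clock started."""
    return clock_start + REPORTING_WINDOW


def hours_remaining(clock_start: datetime, now: datetime) -> float:
    """Hours left before the deadline (negative once the deadline has passed)."""
    return (reporting_deadline(clock_start) - now) / timedelta(hours=1)


start = datetime(2023, 3, 1, 9, 0, tzinfo=timezone.utc)
```

Using timezone-aware timestamps avoids ambiguity when incident data is pulled from systems in different regions.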
Funds and advisors will also need to provide prospective and current investors with updated disclosures on previously disclosed cybersecurity incidents over the past two fiscal years. With security leaders increasingly being held to account over lack of disclosure, failure to report incidents at board level could even be considered an act of fraud.
Organisations must now take proactive steps to prepare and respond effectively to these upcoming regulatory changes. Cybersecurity policies, incident response, and continuity plans need to be written up and closely aligned with business objectives. These policies and procedures should be backed up with robust evidence that shows organisations are actually following the documentation – firms need to prove it, not just say it. Carefully thought-out policies will also provide the foundation for organisations to evolve their posture as cyber threats escalate and regulatory demands change.
Robust cybersecurity risk assessments and continuous vulnerability management must also be in place. The first stage of mitigating a cyber risk is understanding the threat – and this requires in-depth real-time insights on how the attack surface is changing. Internal and external systems should be regularly scanned, and firms must integrate third-party and vendor risk assessments to identify any potential supply chain weaknesses.
Network and cloud penetration testing is another key tenet of compliance. By imitating how an attacker would exploit a vantage point, organisations can check for any weak spots in their strategy before malicious actors attempt to gain an advantage. Due to the rise of ransomware, phishing, and other sophisticated cyber threats, social engineering testing should be conducted alongside conventional penetration testing to cover every attack vector.
It must also be remembered that security and compliance are the responsibility of every person in the organisation. End-user education is a necessity as regulations evolve, as are multi-layered training exercises. This means bringing in immersive simulations, tabletop exercises, and real-world examples of security incidents to inform employees of the potential risks and the role they play in protecting the company.
To successfully navigate the SEC cybersecurity rules – and prepare for future regulatory changes – alternative investment firms must ensure that security is woven into every part of the business. They can do this by establishing robust written policies and adhering to them, conducting regular penetration testing and vulnerability scanning, and ensuring the ongoing education and training of employees.
Gearing up for growth amid economic pressure: 10 top tips for maintaining control of IT costs
Source: Finance Derivative
By Dirk Martin, CEO and Founder of Serviceware
Three years on from the pandemic and economic pressure is continuing to mount more than ever. With the ongoing threat of a global recession looming, inflation rising, and supply chain disruption continuing to take its toll, cutting costs and optimizing budgets remains a top priority amongst the c-suite. Amid such turbulence, the Chief Financial Officer (CFO) and Chief Innovation Officer (CIO) stand firmly at the business’s helm, not only to steady the ship but to steer it into safer, more profitable waters. These vital roles have truly been pulled into the spotlight in recent years, with new hurdles and challenges being constantly thrown their way. This spring, for example, experts expect British businesses to face an energy-cost cliff edge as the winter support package set out by the government is replaced.
Whilst purse strings are being drawn ever tighter to overcome these obstacles, there is no denying that the digitalization and innovation spurred on by the pandemic are still gaining momentum. In fact, according to Gartner, four out of five CEOs are increasing digital technology investments to counter current economic pressures. Investing in a digital future, driven by technologies such as the Cloud, Artificial Intelligence (AI), Blockchains and the Internet of Things (IoT), however, comes at a cost, and to be able to do so, funds must be released through effective optimization of existing assets.
With that in mind, and with the deluge of cost and vendor data descending on businesses who adopt these technologies, never has it been more important for CIOs and CFOs to have a complete, detailed and transparent view of all IT costs. In doing so, business leaders can not only identify the right investment areas but increase the performance of existing systems and technology to tackle the impact of spiralling running costs.
Follow the below 10 steps to gain a comprehensive, detailed and transparent overview of all IT costs to boost business performance and enable your IT to reach the next level.
1: Develop an extensive IT service and product catalogue
The development of an IT service and product catalogue is the most effective way to kick-start your cost-optimization journey. This catalogue should act as a precise overview of all individual IT services and what they entail to directly link IT service costs to IT service performance and value. By offering a clear set of standards as to what services are available and comprised of, consumers can gain an understanding of the costs and values of the IT services they deploy.
2: Monitor IT costs closely
By mastering the value chain, a concept that aims to visualise the flow of IT costs from its most basic singular units through to realised business units and capabilities, businesses can keep track of where IT costs stem from. With the help of service catalogues, benchmarks, the use of a cost model focussing on digital value in IT Financial Management (ITFM) or what is often referred to as Technology Business Management (TBM) solutions, comprehensive access to this data can be guaranteed, creating a ‘cost-to-service flow’ that identifies and controls the availability of IT costs.
3: Determine IT budget management
Knowledge of IT cost allocation is a vital factor when making informed spending decisions and adjustments to existing budgets. There are, however, different approaches that can be taken: centralized, decentralized, and iterative. A centralized approach means that the budget is determined in advance and distributed to operating cost centres and projects in a top-down process, allowing for easy, tight budget allocation. A decentralized approach reverses this process: operating costs are precisely calculated before budgets and projects are determined. Both approaches come with their own risks. A centralized approach may overlook projects that offer potential growth opportunities, while a decentralized approach may produce budget demands that exceed available resources.
The iterative approach tries to unify both methods. Although the most lucrative approach, it also requires the most resources. So, the chosen approach is very much dependent on the available resources, and the enterprise’s structural organization.
4: Defining ‘run’ vs ‘grow’ costs
Before IT budget can be allocated, costs should be split into two distinct categories: running costs (i.e. operating costs) and costs for growing the business (i.e. products or services used to transform or grow the business). Once these categories have been defined, decisions should be made on how the budget should be split between them. A 70% run/30% grow split is fairly typical across most enterprises, but there is no one-size-fits-all approach, and this decision should be centred around the businesses’ overall strategies and end goals.
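As a small worked example of the split, the helper below divides a budget into run and grow portions. The 70% default mirrors the typical figure mentioned above; the total used in the example is a made-up amount.

```python
from decimal import Decimal


def split_budget(total: Decimal, run_share: Decimal = Decimal("0.70")):
    """Split a budget into (run, grow); grow is the remainder, so the parts sum to total."""
    if not Decimal("0") <= run_share <= Decimal("1"):
        raise ValueError("run_share must be between 0 and 1")
    run = (total * run_share).quantize(Decimal("0.01"))
    grow = total - run
    return run, grow


run, grow = split_budget(Decimal("1000000"))  # hypothetical annual IT budget
```

Computing the grow portion as a remainder, rather than multiplying by 30%, ensures rounding never makes the two parts drift from the total.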
5: Ensuring investments result in a profit
By carrying out the aforementioned steps, complete transparency can be achieved over which products and services are offered, where IT costs stem from, and where budgets are allocated. From here, organizations can review how much of the IT budget is being used and where costs lead to profits and losses. By maintaining a positive profit margin, the controlling processes can be further optimized. If the profit margin is negative, appropriate and timely corrective measures can be initiated.
6: Staying on top of regulation
For a company that operates internationally (e.g. it markets IT products and services abroad), it is extremely important that it stays on top of country-specific compliance and adheres to varying international tax rules. To do so correctly, it is necessary to provide correct transfer price documentation. This requires three factors:
- Transparent analysis and calculation of IT services based on the value chain
- Evaluation of the services used and the associated billing processes
- Access to the management of service contracts between providers and consumers as the legal basis for IT services.
7: Stay competitive
Closely linked to the profit mentioned in step five is the question of how to price IT services in order to stay competitive whilst avoiding losses. This begins with benchmark data which can be researched or determined using existing ITFM solutions that can automatically extract them from different – interconnected – databases. From there, a unit cost calculation can be used to define exactly and effectively what individual IT services – and their preliminary products – cost. This allows organizations to easily compare internal unit cost calculations with the benchmarks and competitor prices, before making pricing decisions.
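The comparison described above reduces to simple arithmetic, sketched below. The cost and benchmark figures are invented for illustration, and the function names are not taken from any ITFM product.

```python
def unit_cost(total_cost: float, units: float) -> float:
    """Cost of one unit of an IT service (e.g. one managed workstation)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


def benchmark_gap(internal_unit_cost: float, benchmark_unit_cost: float) -> float:
    """Relative gap to the benchmark; positive means internal costs are higher."""
    return (internal_unit_cost - benchmark_unit_cost) / benchmark_unit_cost


# Hypothetical figures: 120,000 total cost across 400 units, benchmark of 250 per unit.
internal = unit_cost(120_000, 400)
gap = benchmark_gap(internal, 250.0)
```

In this made-up case the internal unit cost sits 20% above the benchmark, which is the kind of signal that would inform a pricing decision.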
8: Identify and maintain key cost drivers
Another aspect of IT cost control that is streamlined via the comprehensive assessment of the cost-to-service flow is the identification and management of main IT cost drivers. A properly modelled value chain makes it clear which IT services or associated preliminary products and cost centres incur the greatest costs and why. This analysis allows for concise adjustment to expenditure and helps to avoid misunderstandings about cost drivers. Using this as a basis, strategies can be developed to reduce IT costs effectively and determine a better use of expensive resources.
9: Showback/Chargeback IT costs
By controlling IT costs using the value chain, efficient usage-based billing and invoicing of IT services and products can be achieved. If IT costs are visualized transparently, they can easily be assigned to IT customers, therefore increasing the clarity of the billing process, and providing opportunities to analyze the value of IT in more detail. When informing managers and users about their consumption there are two options: either through the ‘showback’ process – highlighting the costs generated and how they are incurred – or through the ‘chargeback’ process, in which costs incurred are sent directly to customers and subcontractors.
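Usage-based allocation, which underlies both options, can be sketched as a proportional split. The consumer names and usage units below are hypothetical; the same figures could be reported in a showback or invoiced in a chargeback.

```python
def allocate_by_usage(total_cost: float, usage_by_consumer: dict) -> dict:
    """Split a service's total cost across consumers in proportion to their usage."""
    total_usage = sum(usage_by_consumer.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {
        name: total_cost * usage / total_usage
        for name, usage in usage_by_consumer.items()
    }


# Hypothetical monthly cost of a shared service and usage per department.
charges = allocate_by_usage(9000, {"sales": 50, "finance": 30, "hr": 10})
```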
10: Analyse supply vs. demand
By following the processes above, transparency regarding IT cost control is further extended and discussions around the value of IT services are made possible across the organization. A more holistic analysis of IT service consumption allows conclusions to be drawn promptly to enable the optimization of supply and demand for IT services in various business areas. This, in turn, will enable a more comprehensive value analysis and optimization of IT service utilization.
By following these 10 cost management steps, a secure, transparent, and sustainable IT cost control environment can be developed, resulting in fully optimized budgets and, in turn, significant cost savings. Cost-cutting aside, automating the financial management process in such an environment can boost productivity substantially, freeing up time to focus on valuable work and thus leading to overall business growth.
The business and economic landscape is full of uncertainty right now, but business leaders can regain control via cost management, not only to weather current storms but to set themselves up for success beyond today’s turbulence.
Banking on legacy – The risks posed by ‘stone age’ banking infrastructure
Source: Finance Derivative
By Andreas Wuchner, Angel Investor of Venari Security
If you consider the most significant motivating factors behind cyber-attacks – the promise of large financial reward and the opportunity to cause maximum business and social disruption – it’s little wonder that banks and financial institutions are amongst the most inviting targets for would-be cyber criminals. In fact, according to IBM’s recent report, ‘banking and finance’ was the most attacked industry for the five years between 2015 and 2020 – surpassed only by threats to critical infrastructure in recent years. Successful attacks can provide aggressors with a mass of sensitive personal and financial information, and even access to people’s money itself. Furthermore, a suspension of withdrawals and deposits can cause huge social disruption and reputational damage.
As banks have reacted to years of new regulation and emerging technologies, they often operate with hugely complicated and disparate technology estates. This provides malicious actors with a wealth of potential attack vectors. A small breach from anywhere in this network can have enormous consequences, and lead to entire systems being overrun. As such, it’s crucial that security teams operate with the highest-grade security possible, including ensuring the strongest level of encryption standards. Banks need to look beyond regulatory tick-box commitments and ensure they are taking proactive and preventative steps to monitor and combat malicious attacks across their entire network.
However, responding to cyber-threats across a vast estate requires the speed and flexibility to update security protocols quickly. The sheer volume of legacy infrastructure slows this process down considerably, leaving many security teams in a vicious cycle.
The threat of legacy infrastructure
A sizeable proportion of the banking industry still maintains a reliance on systems first developed more than 40 years ago. In fact, many ‘core banking’ systems, like payments, loans, mortgages and the associated technologies, are still coded using COBOL (Common Business-Oriented Language), an otherwise defunct programming language that is older than the internet itself. In the UK and Europe, COBOL remains the ‘backbone of banking services,’ while in the USA, as much as 43% of banking systems are built on COBOL, meaning it underpins much of our financial system.
This presents a huge security risk. While code has been regularly updated over the years, these systems were built when security threats were far less sophisticated, less well-financed and the burden of data was far less pronounced. For several years, governments have pointed towards legacy systems, built using COBOL, as a major cybersecurity threat, incompatible with modern security best practices and solutions, including multi-factor authentication. For example, data from Kaspersky found that businesses with outdated technology are much more likely to have suffered a data breach (65%) than those who keep their technology updated (29%).
A further security consideration is the diminishing number of people who are trained in maintaining COBOL systems. Every year, experienced professionals exit the industry, making it increasingly difficult to service legacy technologies and creating significant delays in patching threats once they’re identified. This lack of supply of sufficiently trained experts, and the demand they face, makes any updates extremely expensive and time consuming.
Furthermore, legacy infrastructure is preventing the secure application of encryption, posing its own distinct cybersecurity and regulatory risks. Encryption is often heralded as a silver bullet solution for data privacy and has been a continuing area of focus for regulatory bodies in recent years. However, banks remain guilty of poor deployment, maintenance and management of encryption – using outdated protocols and inefficient methods of analysing and understanding network traffic. This, coupled with legacy ‘core banking’ systems that are incompatible with modern encryption techniques, equates to a regulatory and security headache for security teams.
Adopting a new mindset
The risks posed by legacy systems and the volume of cybersecurity threats facing banks mean a concentrated re-think of overall cybersecurity strategy is needed to prevent breaches and ensure data is protected long-term. Traditionally, banks have taken an ‘outside-in’ view – dedicating capacity, finances and knowledge to dealing with threats that are existing, known and well publicised. However, to aid long-term security, this should be superseded by an ‘inside-out’ proactive approach, whereby security teams are cognisant of their own internal systems and where the key vulnerabilities are found. Once banks have a detailed view of the security risks posed by their legacy systems, and specifically what data is threatened, they can address flaws, update these systems and build a stronger overall security posture.
The secure path ahead
Many of our successful high-street banks today have centuries of experience in dealing with social, economic and regulatory upheaval. However, the rapid development and deployment of technology continues to present a unique challenge. Many ‘traditional’ banks have built a complex technology infrastructure through decades of adjustment to new legislation and emerging technologies. While these systems were serviceable in the past, fintech start-ups are pushing their long-term viability to the limit.
Challenger banks have the luxury of being built from the ground up, prioritising convenient digital services and features, and modern security processes. As the user base of these banks increases, customers increasingly expect these features and security from their existing banks, which adds even more complexity to legacy infrastructures. As outlined by Deloitte, existing firms simply aren’t positioned to support the rising expectations of the market, exposing banks to additional risk and liability.
What’s more, it’s estimated that banks spend as much as 80% of their yearly IT budgets on the maintenance of legacy systems. While an immediate switch away from these systems is unrealistic, there is an opportunity to reduce wasted spend and divert spend towards modernisation efforts. However, while traditional banks may want to adapt quicker to technological advancements, they need to do so while continuing to minimise cyber risk and without jeopardising the security of their data or systems. This means placing cybersecurity at the heart of any modernisation efforts and maintaining a steady rate of change. As more of the technology estate begins to be modernised, the potential risks of regulatory non-compliance will also reduce.
Legacy systems need a considered update
Banking systems have relied heavily on legacy infrastructure for too long now, bringing difficulties in maintaining the highest-grade cybersecurity and in facilitating innovation. The risks presented by novel cybersecurity attack vectors and competition from new and emerging digital services offered by challenger banks are exacerbating these issues. As such, legacy systems need a managed modernisation in the long term, facilitated in part by a managed redistribution of existing IT spend. However, to ensure long-term security overall, cybersecurity needs to be at the very heart of modernisation efforts.