Business

Web Scraping in 2022 & Beyond

Source: Finance Derivative

Web scraping has been coming into the limelight in recent years due to the rising interest in data. Businesses across the globe have been eyeing automated data collection as a way to enhance their profitability and overall decision making.

We’ve sat down with the Lead of Commercial Product Owners at Oxylabs.io, Nedas Višniauskas, to talk about the future of web scraping. Few people have been as deeply involved with the industry as Nedas, which has allowed him to gain a unique perspective on how it has developed and how it will continue to do so.

What do you think has been the biggest change in web scraping over the last decade? How has Oxylabs participated in these changes?

There have been some interesting changes during the past few years. One of them, I think, has been the proliferation of increasingly sophisticated anti-bot systems. Scraping such websites at scale, in turn, becomes more difficult.

Scraping enthusiasts, of course, have their own answer to these issues, which is to develop dedicated data collection tools. These tools, while narrower in their field of use, can bypass anti-bot systems and are constantly being updated for that purpose.

Another important change has been the rising popularity of JavaScript. More and more websites are using it to load critically important data dynamically, which means it’s essentially unreachable without browsers.

Headless browsers, therefore, are a necessity. At the same time, that means infrastructure costs are rising, as headless browsers take up much more computing power and traffic than simple HTTP requests.

Finally, ethics have been in the limelight. For example, residential proxy providers are looking for ways to inform and reward participants of the network. We ourselves took charge of building the framework for ethical acquisition, which, I believe, has played a part in the fact that there are fewer shady practices and more clarity among all industry participants.

To answer the second question, Oxylabs have reacted to these changes with the development of Scraper APIs. We created both dedicated and universal scrapers that can acquire publicly available data from nearly any website without issue. Additionally, all of our proxies are ethically sourced, giving our partners much-needed peace of mind when engaging in scraping.

Have you seen or noticed any particular trends in data acquisition or web scraping? Are specific data types becoming popular?

Off the cuff I’d say that the use of ecommerce and delivery data has been booming since the pandemic hit. Businesses want to (legally) spy on competitors and gain access to as much data as possible. Data types like pricing, products or delivery times are important to any competitor.

But these have always been important. Maybe I would say that external data in general has risen in importance. Outside of that, I don’t think there have been any particular trends in data types. There have been, however, changes in the entire supply chain. As I’ve mentioned, businesses only really need the data. Even then, the data is not the key – insights are.

As such, businesses at the tail-end of the chain have proliferated in recent years. Data-as-a-service aggregators, ones that collect information and sell sets of it, have been rising in popularity.

There are also some businesses that provide insights directly. While these are still few and far between, some of them have unique value propositions that I could see as worthwhile. Jungle Scout, for example, is a service that both scrapes external data and has large datasets from internal sources. As such, they can provide insights other businesses can’t.

What do you think are the biggest challenges the industry is facing currently? Are there any innovative solutions to these or other challenges on the horizon?

Bot protection has always been the greatest challenge. Scraping, you see, is a cat-and-mouse game. Websites attempt to implement anti-bot measures, such as the well-known CAPTCHA, while scraping companies attempt to continue evading them to retain access to data.

There have been great strides made in bot protection. TLS (Transport Layer Security) fingerprinting has been one such improvement. Sophisticated websites can fingerprint the initial network handshake and check it against the request headers. As many scraping tools manually modify the headers they send, the TLS fingerprint often does not match them, which is a dead giveaway.
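
To make the mismatch idea concrete, here is a minimal Python sketch of the kind of consistency check a website might run. It is an illustration under stated assumptions, not any vendor’s actual method: the fingerprint labels, the KNOWN_FINGERPRINTS table and both function names are hypothetical, and real systems derive fingerprints from the details of the TLS handshake rather than from ready-made labels.

```python
# Hypothetical table: TLS fingerprint label -> client family that usually produces it.
# The labels are invented for illustration; they are not real fingerprint values.
KNOWN_FINGERPRINTS = {
    "fp-chrome-like": "chrome",
    "fp-firefox-like": "firefox",
    "fp-python-http-client": "script",
}


def claimed_family(user_agent: str) -> str:
    """Return the client family that the User-Agent header claims to be."""
    ua = user_agent.lower()
    if "firefox" in ua:
        return "firefox"
    if "chrome" in ua:
        return "chrome"
    return "script"


def looks_mismatched(tls_fingerprint: str, user_agent: str) -> bool:
    """True when the handshake fingerprint contradicts the claimed browser."""
    observed = KNOWN_FINGERPRINTS.get(tls_fingerprint)
    if observed is None:
        return False  # unknown fingerprint: this sketch makes no judgement
    return observed != claimed_family(user_agent)


if __name__ == "__main__":
    chrome_ua = "Mozilla/5.0 (X11; Linux x86_64) Chrome/120.0 Safari/537.36"
    # A script that only rewrites its headers still shows a script-like handshake.
    print(looks_mismatched("fp-python-http-client", chrome_ua))
    # A real browser's handshake agrees with its own headers.
    print(looks_mismatched("fp-chrome-like", chrome_ua))
```

The point of the sketch is that changing the header alone is not enough: the handshake and the headers have to tell the same story.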

On the other hand, the deck is always slightly stacked in favor of scraping. Most anti-bot protection features put a dent in the overall user experience. Filling in a CAPTCHA is something that detracts from that frictionless experience of the modern web we’re used to.

Some businesses use these techniques and see no issue. Others, ones highly concerned with delivering the best user experience possible, avoid using CAPTCHAs unless absolutely necessary. It’s always a tradeoff. More bot protection equals, almost always, worse UX, which leads to less revenue. But then fewer people are scraping your website.

Additionally, new pages with interesting data and content appear all the time. And you don’t start building a website from bot protection. It has to be functional first. So, for a long time, scraping a new site is a lot easier than it could be.

Would you say that there are potential benefits in web scraping for academic research or policy-making? If so, why hasn’t the scientific or political community adopted the practice?

Academic research, quantitative in particular, is in large part based on data that doesn’t exist on the internet yet. There could be studies, however, on internet behavior or something of the like where scraping could be immensely useful. Additionally, I think we’re not seeing such widespread adoption due to the previously mentioned barrier to entry.

Let’s imagine that there’s no previous scraping experience in some particular university. The researcher would have to build everything from the ground up, get all the deep knowledge, and the funding required just to start acquiring the data.

It doesn’t help that the research areas that benefit the most from scraping (like sociology, economics, psychology, etc.) are far removed from coding, development, and IT in general. I think it’s more of an unfortunate, but temporary, circumstance, because web scraping providers will be able to reduce the barrier by a significant margin in the future.

When it comes to policy-making, I’m not so sure. I think that rather than making, it should be about enforcing. Governments are definitely knee-deep in web scraping for all kinds of security purposes. Businesses, on the other hand, have been using the same processes to protect themselves from counterfeits and copyright infringement. There’s an entire business vertical dedicated explicitly to brand protection.

Business

How can a payments strategy support business growth?

Source: Finance Derivative

Following the global economic upheaval brought on by the pandemic, businesses are once again prioritising growth on a global scale. While every business recognises the importance of expansion, their methods, obstacles, and risks differ greatly.

In the following article, Sonya Geelon, Chief Commercial Officer at Conferma, explores some of the most common challenges holding businesses back, and how including innovative payment solutions in your payments strategy can position your business to expand into global markets.

Barriers to global expansion

At Conferma, we wanted to know what businesses felt stood between them and their growth ambitions, so we spoke to 400 financial decision makers to find out.

The research, shared in our new Growth Ignition Index report, identified global expansion as a key priority for businesses looking to grow across all regions. Significant drivers included increasing customer demand (46 per cent), maintaining a consistent cashflow (36 per cent) and undertaking digital transformation (34 per cent). Businesses also highlighted a number of barriers, such as identifying valuable markets to expand into (27 per cent) and navigating complex cross-border payment systems (13 per cent). The following sheds light on some of the factors that businesses perceive to be hindering their growth.

Operational inefficiencies

It’s a well-known fact that operational efficiency is crucial for giving businesses the competitive edge. If your processes run smoothly and effectively, you’re likely in a good position to grow. However, a third (33 per cent) of businesses identified operational inefficiencies as a significant sticking point, particularly among small and medium-sized organisations. This perhaps indicates that larger companies have already invested in boosting efficiency to a degree; however, the issue was noted across businesses of all sizes.

Complex cross-border payments

Successful growth relies heavily on being able to make fast, seamless transactions. However, recent research from Rapyd found that 38 per cent of businesses experience delays of five days or more when sending or receiving international payments.[1] Costs and delays in cross-border transactions can have a significant impact on growth, cutting into revenues, restricting cash flow and complicating financial planning. Our own research highlighted this, with 14 per cent of businesses reporting slow and/or complex cross-border payments as a significant barrier to expansion.

So how can businesses overcome these challenges and unlock global growth?

Taking your payments strategy virtual

Amid the array of payment options available in the market, virtual cards have emerged as a versatile solution, valued by users globally. According to Juniper Research, the global value of virtual cards will increase over threefold in just 5 years, climbing from $1.9 trillion in 2021 to a staggering $6.8 trillion by 2026.[2]

So how do they work?

Virtual cards are essentially digital versions of traditional credit cards. The technology generates a 16-digit card number, allowing an employee to make payments without having to physically hand over a card. Instead, they provide the virtual card number, expiration date, and security code, just like they would with a regular credit or debit card.

Virtual cards come with built-in fraud and security features, enabling restrictions on usage. For instance, users can set a spending limit, specify a date range or limit usage to certain merchants. This ensures that any attempts to exceed the set amount, use the card at unauthorised merchants, or spend beyond the specified date range will result in a declined transaction.
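
As a rough illustration of how such controls lead to a declined transaction, here is a minimal Python sketch. It is not Conferma’s API or any issuer’s implementation: the VirtualCard class, its field names and the merchant names are hypothetical, and amounts are held in pence purely to keep the arithmetic exact.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VirtualCard:
    """Hypothetical virtual card carrying the three controls described above."""
    limit_pence: int              # total amount the card may spend
    valid_from: date              # first day the card may be used
    valid_to: date                # last day the card may be used
    allowed_merchants: frozenset  # merchants the card may be used with
    spent_pence: int = 0

    def authorize(self, amount_pence: int, merchant: str, on: date) -> bool:
        """Approve the payment only if it passes every control."""
        if not (self.valid_from <= on <= self.valid_to):
            return False  # outside the specified date range
        if merchant not in self.allowed_merchants:
            return False  # unauthorised merchant
        if self.spent_pence + amount_pence > self.limit_pence:
            return False  # would exceed the set amount
        self.spent_pence += amount_pence
        return True


if __name__ == "__main__":
    card = VirtualCard(
        limit_pence=50_000,
        valid_from=date(2024, 1, 1),
        valid_to=date(2024, 1, 31),
        allowed_merchants=frozenset({"Example Airline"}),
    )
    print(card.authorize(30_000, "Example Airline", date(2024, 1, 10)))  # within all limits
    print(card.authorize(30_000, "Example Airline", date(2024, 1, 11)))  # over the set amount
```

Each check maps to one of the restrictions in the paragraph above, and a payment that fails any of them leaves the card’s running total untouched.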

Using a virtual card provider allows access to extensive, pre-existing payments ecosystems. For example, Conferma connects 75+ card issuers and banks across the world. This enables businesses to use virtual cards in 62 different currencies, making international payments frictionless while mitigating costly cross-border fees. Virtual cards can also help boost cashflow and improve operational efficiency, automating reconciliation and cutting lengthy processing times. By removing convoluted payment processes, virtual cards give businesses the freedom to grow in the markets they deem most valuable, not just most accessible.

Of those surveyed, four out of five respondents (82 per cent) plan on expanding their virtual card usage in the next 12 months, with 64 per cent extending usage to additional payment needs. Businesses already using virtual cards also anticipate a substantial increase in the volume of payments they make virtually, with our data projecting a rise from 45 to 57 per cent of all payments being made using virtual cards in the next 12 months.

Virtual cards offer a compelling solution to the challenges limiting international growth by offering enhanced security, streamlined operational processes, and seamless cross-border transactions. By embracing virtual cards as a strategic tool, organisations can unlock opportunities for growth and innovation, empowering them to navigate the complexities of international commerce with ease.


[1] The 2023 State of Cross-Border Payments, Rapyd, 2023.

[2] Virtual Cards: B2B and B2C Applications, Competitive Analysis & Market Forecasts 2021-2026, Juniper Research

Business

How can businesses make the cloud optional in their operations?

Max Alexander, Co-founder at Ditto

Modern business apps are built to be cloud-dependent. This is great for accessing limitless compute and data storage capabilities, but when the connection to the cloud is poor or drops, business apps stop working, impacting revenue and service. If real-time data is needed for quick decision-making in fields like healthcare, a stalled app can potentially put people in life-threatening situations.

Organisations in sectors as diverse as airlines, fast food retail, and ecommerce have deskless staff who need digital tools accessible on smartphones, tablets and other devices to do their jobs. But because of widespread connectivity issues and outages, these organisations are beginning to consider how to ensure these tools can operate reliably when the cloud is not accessible.

The short answer is that building applications with a local-first architecture can help to ensure that they remain functional when disconnected from the internet. But then, why are not all apps built this way? The simple answer is that building and deploying cloud-only applications is much easier as ready-made tools for developers help expedite a lot of the backend building process. The more complex answer is that a local-first architecture solves the issue of offline data accessibility but does not solve the critical issue of offline data synchronisation. Apps disconnected from the internet still have no way to share data across devices. That is where peer-to-peer data sync and mesh networking come into play.

Combining offline-first architecture with peer-to-peer data sync

In the real world, what does an application like this look like?

  • Apps must prioritise local data sync. Rather than sending data to a remote server, applications must be able to write data to their local database in the first instance, then listen for changes from other devices and recombine them as needed. Apps should use local transports such as Bluetooth Low Energy (BLE) and Peer-to-Peer Wi-Fi (P2P Wi-Fi) to communicate data changes when the internet, local server, or the cloud is not available.
  • Devices are capable of creating real-time mesh networks. Nearby devices should be able to discover, communicate, and maintain constant connections with devices in areas of limited or no connectivity.
  • Seamlessly transition from online to offline (and vice versa). Combining local sync with mesh networking means that devices in the same mesh are constantly updating a local version of the database and opportunistically syncing those changes with the cloud when it is available.
  • Partitioned between large peer and small peer mesh networks, so that smaller networks are not overwhelmed by trying to sync every piece of data. To do this, smaller networks only sync the data they request, so developers have complete control over bandwidth usage and storage. This is vital when connectivity is erratic or critical data needs prioritising. Larger networks, by contrast, sync as much data as they can when there is full access to cloud-based systems.
  • Ad-hoc to enable devices to join and leave the mesh when they need to. This also means that there can be no central server other devices are relying on.
  • Compatible with all data at any time. All devices should account for incoming data with different schemas. In this way, if a device is offline and running an outdated app version, for example, it still must be able to read new data and sync.
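
A few of the points above (write locally first, recombine changes from other devices, and let small peers sync only what they request) can be sketched in a handful of lines of Python. This is a toy model under stated assumptions, not Ditto’s implementation: the Peer class and its methods are hypothetical, the transport (BLE, P2P Wi-Fi) is replaced by a direct method call, and conflicts are resolved with a simple last-writer-wins rule chosen for brevity, since the article does not say how changes are recombined.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical device in the mesh, holding its own local store."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (counter, writer, value)
    clock: int = 0                             # logical counter, not wall time

    def write(self, key, value):
        """Local-first: the write lands in the local store, no server involved."""
        self.clock += 1
        self.store[key] = (self.clock, self.name, value)

    def merge_from(self, other, keys=None):
        """Pull changes from a nearby peer; `keys` limits what a small peer syncs."""
        wanted = list(other.store) if keys is None else keys
        for key in wanted:
            if key not in other.store:
                continue
            incoming = other.store[key]
            current = self.store.get(key)
            # Last-writer-wins: higher counter wins, writer name breaks ties.
            if current is None or incoming[:2] > current[:2]:
                self.store[key] = incoming
            self.clock = max(self.clock, incoming[0])

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[2]


if __name__ == "__main__":
    kiosk, kitchen = Peer("kiosk"), Peer("kitchen")
    kiosk.write("order-1", "burger")
    kitchen.merge_from(kiosk)            # no cloud round trip needed
    kitchen.write("order-1", "burger, no onions")
    kiosk.merge_from(kitchen)            # the later edit wins on both devices
    print(kiosk.read("order-1"))
```

Syncing the same stores with the cloud whenever it is reachable would use the same merge step, which is what makes the online/offline transition seamless in this model.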

Peer-to-peer sync and mesh networking in practice

Let us take a look at a point-of-sale application in the fast-paced environment of a quick-service restaurant. When an order is taken at a kiosk or counter, that data must travel hundreds of miles to a data centre to arrive at a device four metres away in the kitchen. This is an inefficient process and can slow down or even halt operations, especially if there is an internet outage or any issues with the cloud.

A major fast-food restaurant in the US has already modernised its point of sale system using this new architecture and created one that can move order data between store devices independently of an internet connection. As such, this system is much more resilient in the face of outages, ensuring employees can always deliver best-in-class service, regardless of internet connectivity.

The vast power of cloud-optional computing is showcased in healthcare situations in rural areas in developing countries. By using both peer-to-peer data sync and mesh networking, essential healthcare applications can share critical health information without the Internet or a connection to the cloud. This means that healthcare workers in disconnected environments can now quickly process information and share it with relevant colleagues, empowering faster reaction times that can save lives.

Although the shift from cloud-only to cloud-optional is subtle and will not be obvious to end users, it really is a fundamental paradigm shift. This move provides a number of business opportunities for increasing revenue and efficiencies and helps ensure sustained service for customers.

Business

When something personal fills an important gap in the market 

by Cécile Mazuet-Eller, founder of NameSwitch

There aren’t many business ideas that go from a personal experience to filling an important gap in the market. However, this is certainly the case for NameSwitch, the UK’s pioneering and only name-changing support service, launched in 2018. But what inspired its inception and what challenges did it face? Here, Cécile Mazuet-Eller, founder of the company, now in its seventh year, explains.

My entrepreneurial journey is a bit unusual in that it started from my own experience of going through a divorce, which became a pivotal turning point for me not only emotionally, but practically too. I wanted to remove my married name, and I had a visceral reason to do so as I really didn’t want to keep it. Feeling extremely frustrated at still receiving letters and official documents featuring my previous name, I was desperate to change it but, as for so many people, it became a stop-start, arduous task.

Once I started the process, I realised it was taking up far too much time that I didn’t have. Being a single mum to two young children and working full-time is no mean feat, so it was clear the name-changing process wasn’t going to be easy. When I searched for a solution to help, all I came up with was a service covering the US and Canada, but nothing that worked for the UK. In the end, it took me a whole year to get everything changed that had to be, which proved long and stressful to say the least.

Nurturing the idea

In the early days I was fortunate enough to be surrounded by positive people who had good contacts, and who saw the viability of my idea. Living in a small community filled with intelligent and well-rounded people, I wasn’t short of encouragement from them and friends, who recognised as well as I did that there was a definite gap in the market. I worked with a web development team in Serbia, which had also been recommended to me, and enlisted additional help from a university student for some research.

I always wanted to run my own business, and there were several reasons why I needed to embark on something new. As the only breadwinner in the house, I faced mounting bills while balancing the demands of motherhood and other financial responsibilities. Cash was limited, but what little I had I put carefully into the business.

In the early stages, which included the development of the unique technology that underpins the service, I carved pockets of time at night and on weekends to create a strong foundation for the business. Creating something completely from scratch was like a form of healing, which is why it was and remains such a personal project.

After I had mulled over the idea for at least two years following the original lightbulb moment, the business was registered in 2015, with time needed to build the robust platform and create a viable product. Drawing on my previous experience, I investigated overseas equivalents, financials and marketing intelligence to ensure there was a genuine need for the service in the UK. Fortunately, I was able to share my plans with my employer at the time, who turned out to be my biggest supporter and became my first paying customer, purchasing a NameSwitch for his ex-wife, who was getting married to someone else!

With a career in telecommunications and a degree in marketing, I was already used to hard work, and having the support and encouragement of my telecoms team was extremely helpful.

Support and coaching

Coaching was an important element of the start-up process, obtained through a wider network and some financial support from family, with no other funding or investment being available.

The challenges

Like all businesses, we were presented with certain obstacles. There was a lot to juggle and at times it felt like too much, but I managed to navigate the complexities involved. Covid was a huge setback when it hit, given that our biggest target market was, and still is, newly-weds. With all weddings banned, NameSwitch was hit hard, but our saving grace was the people who used their time in lockdown to change their names, something they previously didn’t have time for. As I was 100% employed by the business by this stage, it turned into a year of survival and another big challenge.

In 2022-2023 we concentrated on growth for NameSwitch. Once my dedicated team and I were satisfied with the service, it was time to consider investment in PR, advertising and partnerships to increase brand awareness and reach the revenues that were needed.

It was forecast that 285,000 – 415,000 weddings would take place in 2022-2024 as a result of the pandemic, which has reflected well on the business in recent years. And amidst the trials and tribulations, it has proved to be exhilarating and exhausting in equal measure.

With hindsight, there are certain things I’d have done differently, such as bringing in a partner early on to put us in a stronger position sooner, and adding more resource to improve growth, but I know that’s all part of the steep learning curve and something to take with me to projects in the future.

Advice for aspiring entrepreneurs

For anyone contemplating their own entrepreneurial endeavours, my advice is to ‘one hundred percent go for it’ – but do not bet the house on it and, whatever happens, embrace the journey.

Copyright © 2021 Futures Parity.