screen scraping


Screen scraping is the act of copying information that shows on a digital display so it can be used for another purpose. Visual data can be collected as raw text from on-screen elements such as text or images that appear on the desktop, in an application or on a website. Screen scraping can be performed automatically with a scraping program or manually by an individual extracting the data.

Screen scraping has a variety of uses, both ethical and unethical. An ethical example is a banking app that gathers data from a user's multiple accounts; an unethical one is stealing data from another application. A developer might be tempted to steal from another application to make development faster and easier for themselves.

What is it used for?

Screen scrapers have been applied in a wide range of fields for a variety of use cases. Some potential uses include:

  • supporting banking applications and financial transactions;
  • saving meaningful data for later use;
  • performing actions a user would on a website;
  • translating data from a legacy application to a modern application;
  • feeding data aggregators such as price comparison websites;
  • tracking user profiles to see online activities; and
  • stealing data.

One of the largest use cases has been in banking. Lenders may want to use screen scraping to gather a customer's financial data. Finance applications may use screen scraping to access a user's multiple accounts and aggregate all the information in one place. Users need to explicitly trust the application, however, because they are entrusting that organization with their accounts, customer data and passwords. Mortgage provider applications can also use screen scraping.

An organization might also want to use screen scraping to translate between legacy application programs and new user interfaces (UIs) so that the logic and data associated with the legacy programs can continue to be used. This option is rarely used and is only considered when other methods are impractical.

If an individual can gain access to the underlying code in an application, they could use screen scraping to steal the code and use it in their own application. This would save the individual time and effort, or allow them to learn, without permission, how a feature in the application works.

In some cases, screen scraping involves a third-party system. For example, screen scraping would allow a third-party organization to access data on financial transactions in a budgeting app.

The main use cases for screen scraping have changed over time. In 2019, for example, screen scraping began to be phased out of banking, one of its larger use cases. This was done to ease security concerns surrounding the practice. Budgeting apps must now use a single, open banking technology.

How does screen scraping work?

Screen scraping can be accomplished in several ways, depending on what the process is being used for. For example, a program written in Java can copy data from one application into another if it has a direct pathway to access it.

In general, screen scraping allows a user to extract screen display data from a specific UI element or from documents. Different methods can be used to obtain either all the text on a page, unformatted, or all the text on a page, formatted, with exact positioning. Screen scrapers can be built around browser-based applications, which allow users to obtain information from pages in a browser. Unix tools, such as shell scripts, can also be used as a simple screen scraper.
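
As a rough illustration of the "all the text on a page, unformatted" case, the sketch below pulls the visible text out of an HTML page using only Python's standard library. The class name `TextScraper`, the helper `scrape_text` and the sample page are hypothetical and chosen for this example; a real scraper would also need to fetch the page and cope with malformed markup.

```python
from html.parser import HTMLParser


class TextScraper(HTMLParser):
    """Collects the visible text of a page, ignoring script and style blocks."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def scrape_text(html):
    """Return the visible text of the page as one unformatted string."""
    parser = TextScraper()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical page used only for this example.
SAMPLE_PAGE = """
<html><head><style>p { color: red; }</style></head>
<body><h1>Account summary</h1><p>Balance: <b>120.50</b></p>
<script>console.log("ignored");</script></body></html>
"""

print(scrape_text(SAMPLE_PAGE))
```

Keeping the formatting and exact positioning would require recording each element's place in the layout as well, which this sketch deliberately leaves out.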

In banking, a third party will ask users to share their login information so it can access financial transaction data by logging in to digital portals on the customers' behalf. A budgeting app can then retrieve the incoming and outgoing transactions across accounts.

When transferring data from a legacy program, a data scraping program must take the data coming from the legacy program, which is formatted for the screen of an older type of terminal such as an IBM display, and reformat it for Windows 10 or for someone using a web browser. The program must also reformat user input from the newer user interfaces (such as Windows or a web browser) so that the request can be handled by the legacy application as if it came from the user of the older device and user interface.
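
The two directions of this translation can be sketched in a few lines of Python. The screen layout, field names and sample screen below are invented for illustration and do not correspond to any particular terminal or product: the first function reads fixed-position fields off a captured text screen, and the second pads user input back to the fixed width a legacy field might expect.

```python
# Hypothetical screen layout: (field name, row, start column, width).
FIELD_LAYOUT = [
    ("customer_id", 0, 13, 8),
    ("name", 1, 13, 20),
    ("balance", 2, 13, 10),
]


def scrape_screen(screen_lines, layout=FIELD_LAYOUT):
    """Read fixed-position fields from a captured terminal screen."""
    record = {}
    for name, row, col, width in layout:
        # Pad short lines so slicing never runs past the end of the row.
        line = screen_lines[row].ljust(col + width)
        record[name] = line[col:col + width].strip()
    return record


def format_input(value, width):
    """Truncate or pad user input to the fixed width of a legacy field."""
    return value[:width].ljust(width)


# Hypothetical captured screen, one string per terminal row.
SAMPLE_SCREEN = [
    "CUSTOMER ID: 00123456",
    "NAME       : JANE DOE",
    "BALANCE    : 1250.75",
]

print(scrape_screen(SAMPLE_SCREEN))
print(repr(format_input("JANE", 8)))
```

The resulting dictionary can be handed to a modern UI, while `format_input` shows the reverse step of shaping new input so the legacy program sees what it expects.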

How to prevent screen scraping

Unfortunately, there is no single definitive way to prevent screen scraping. However, there are ways to deter it. An organization can detect screen scraping through a few telltale signatures or usage behaviors. For example, a nonstandard user agent, JavaScript that fails to run client-side or an unusually large number of page requests in sequence may be a sign of screen scraping.

To help deter screen scraping, an organization can:
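
The three signals mentioned above could be combined into a simple check along the following lines. This is a minimal sketch rather than a production detector: the list of browser tokens, the request threshold and the function name `scraping_signals` are all assumptions made for the example, and a real system would tune them against its own traffic.

```python
# Assumed markers of a mainstream browser user agent (illustrative only).
KNOWN_BROWSER_TOKENS = ("Mozilla/", "Chrome/", "Safari/", "Firefox/")

# Assumed limit on page requests per time window before traffic looks automated.
MAX_REQUESTS_PER_WINDOW = 30


def scraping_signals(user_agent, ran_javascript, requests_in_window):
    """Return the list of scraping indicators observed for one client."""
    signals = []
    if not any(token in user_agent for token in KNOWN_BROWSER_TOKENS):
        signals.append("nonstandard user agent")
    if not ran_javascript:
        signals.append("JavaScript did not run")
    if requests_in_window > MAX_REQUESTS_PER_WINDOW:
        signals.append("high request rate")
    return signals


# A scripted client versus an ordinary browser session.
print(scraping_signals("python-urllib/3.11", False, 200))
print(scraping_signals("Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0", True, 4))
```

Each signal on its own is weak, because user agents can be forged and legitimate users sometimes disable JavaScript, so results like these are better treated as hints for further review than as proof.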

  • use one-time passwords, because screen scrapers will not be able to see a password until it is used;
  • use detection tools, which can help detect signature- or behavior-based actions;
  • set a value to be checked by the web server;
  • make sure endpoints aren't exposed;
  • run fraud detection software to catch screen scraping potentially while it is happening; and/or
  • set content to be shown as an image, which won't stop screen scraping from happening but will stop programs that can't translate images.

All these methods can help deter screen scraping, but they won't stop it completely. In addition, organizations must make sure that their actions won't make the end-user experience worse. For example, setting a website's content to appear as an image may make it difficult for individuals to find the page, because it affects how search engines find the page to begin with.
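
One of the deterrents listed above, a value that the web server checks, could be implemented in many ways. The sketch below shows one possible interpretation: the server issues a keyed token tied to a session and rejects requests that do not return it. The secret key, the session identifier and both function names are hypothetical, and the source does not say where such a value would be carried.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load this securely.
SECRET_KEY = b"replace-with-a-real-secret"


def issue_token(session_id):
    """Create a token the server embeds in the page it serves to this session."""
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_request(session_id, token):
    """Check that a follow-up request carries the token issued to its session."""
    expected = issue_token(session_id)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, token)


token = issue_token("session-42")
print(is_valid_request("session-42", token))
print(is_valid_request("session-42", "forged"))
```

A scraper that loads the page first can still read and replay the token, which is consistent with the point that these measures deter scraping rather than stop it.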

Screen scraping tools

If individuals don't want to screen scrape manually, there are several tools that can help automate the process, such as:

  • UiPath
  • Jacada
  • FMiner
  • Macro Scheduler
  • ScreenScraper Studio
  • Existek

These tools include automation features such as automated user interfaces, macro recorders and editors. They work with Windows or web applications. Some tools offer features that others lack, and some focus on specific platforms.

Screen scraping vs. web scraping

While screen scraping is the process of extracting data shown on a screen, web scraping extracts data from the web. The two concepts are similar enough that web scraping can be described as a specific type of screen scraping. The main differences lie in where the data is being taken from and what it is being used for.

Web scraping is used to extract data exclusively from the web, unlike screen scraping, which can also scrape data from a user's desktop or applications. This form of data extraction can be used to compare prices for goods online and for web indexing.

The process accesses the web over HTTP or through a web browser and can be done either manually or automatically with a bot or web crawler.
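
For the automated case, one basic step of web indexing is collecting the links on a page so that they can be visited next. The sketch below does this with Python's standard library; the class `LinkCollector`, the helper `collect_links`, the sample markup and the example.com addresses are placeholders chosen for illustration, and fetching the page over the network is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Gathers the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the address of the page.
                self.links.append(urljoin(self.base_url, value))


def collect_links(html, base_url):
    """Return the absolute links found in the given HTML, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page fragment used only for this example.
SAMPLE_HTML = (
    '<p><a href="/products">Products</a> '
    '<a href="https://example.org/about">About</a></p>'
)

print(collect_links(SAMPLE_HTML, "https://example.com/index.html"))
```

A crawler would repeat this step for each collected link, usually while honoring a site's crawling rules and rate limits.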

Difference between screen scraping and data scraping

Data scraping is a variant of screen scraping that is used to copy data from documents and web applications. It is a technique in which structured, human-readable data is extracted. This method is mostly used for exchanging data with a legacy system and making that data readable by modern applications.

Screen scraping and open banking

Open banking is the concept of sharing secured financial information to be used by third-party developers for the creation of banking applications. This concept is based on the sharing of APIs, which allows an application to use the same API to aggregate information from different accounts into one place. This is what allows a banking app to let users look at their multiple accounts from different banks in one place.

In the past, some banking apps would gather information using screen scraping. This process required a user to share their bank logon credentials with the third-party app. The application would then log on to the user's accounts on their behalf and screen scrape the data needed to show in-app.

By contrast, open banking now uses shared APIs, meaning the exact data needed is copied without requiring the user to share logon credentials. The concept was introduced in 2018 and is now becoming the standard in place of screen scraping.
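
To show how the API approach differs from scraping a logged-in page, the sketch below aggregates balances from two structured API responses. The JSON shape, the account identifiers and the function name `aggregate_balances` are invented for this example and do not follow any real open banking specification; the point is only that the app receives exactly the data it needs, already structured, without handling the user's password.

```python
import json
from decimal import Decimal

# Hypothetical API responses from two banks (not a real open banking schema).
BANK_A_RESPONSE = '{"accounts": [{"id": "A-1", "balance": "100.25"}]}'
BANK_B_RESPONSE = (
    '{"accounts": [{"id": "B-7", "balance": "49.75"},'
    ' {"id": "B-8", "balance": "10.00"}]}'
)


def aggregate_balances(responses):
    """Merge account balances from several API responses into one mapping."""
    balances = {}
    for raw in responses:
        for account in json.loads(raw)["accounts"]:
            # Decimal keeps currency amounts exact, unlike binary floats.
            balances[account["id"]] = Decimal(account["balance"])
    return balances


balances = aggregate_balances([BANK_A_RESPONSE, BANK_B_RESPONSE])
print(balances)
print(sum(balances.values(), Decimal("0")))
```

In a real integration the responses would come from authorized API calls made with a token the user granted, rather than from hard-coded strings.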

This was last updated in February 2020
