{"id":20967,"date":"2025-01-22T12:25:23","date_gmt":"2025-01-22T12:25:23","guid":{"rendered":"https:\/\/www.scrapingdog.com\/?p=20967"},"modified":"2025-08-14T09:49:52","modified_gmt":"2025-08-14T09:49:52","slug":"playwright-with-nodejs","status":"publish","type":"post","link":"https:\/\/www.scrapingdog.com\/blog\/playwright-with-nodejs\/","title":{"rendered":"Complete Guide on Using Playwright with Nodejs"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"20967\" class=\"elementor elementor-20967\" data-elementor-post-type=\"post\">\n\t\t\t\t<div class=\"elementor-element elementor-element-f4db730 e-flex e-con-boxed e-con e-parent\" data-id=\"f4db730\" data-element_type=\"container\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t\t\t<div class=\"elementor-element elementor-element-16892fb elementor-widget elementor-widget-html\" data-id=\"16892fb\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<!-- Gutenberg \u201cCustom HTML\u201d block -->\r\n<div style=\"\r\n  background:#d9f4e5;\r\n  border-left:4px solid #1d9b6c;\r\n  padding:18px 24px;\r\n  margin:24px 0;\r\n  border-radius:6px;\r\n  font-family:'Montserrat',sans-serif;\r\n  font-size:18px;\r\n  line-height:1.65;\r\n  color:#1a1a1a;\">\r\n  <p style=\"margin:0 0 8px 0;font-weight:600;\">TL;DR<\/p>\r\n\r\n  <ul style=\"margin:0; padding-left:20px;\">\r\n    <li><strong>Playwright<\/strong> + <strong>Node.js<\/strong> setup, launch, and first run.<\/li>\r\n    <li>Scrape flow: load page &rarr; grab HTML &rarr; parse with <code>Cheerio<\/code> (<em>IMDb<\/em> demo).<\/li>\r\n    <li>Techniques: <code>waitForSelector<\/code>, infinite scroll loop, type \/ click flows.<\/li>\r\n    <li>Use proxies for scale; Playwright is powerful, but for hands-off scaling the article recommends <strong>Scrapingdog<\/strong>.<\/li>\r\n  <\/ul>\r\n<\/div>\r\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div 
class=\"elementor-element elementor-element-dbaf1b3 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"dbaf1b3\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"c1b8\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Recently, I wrote an article on\u00a0<a class=\"af mf\" href=\"https:\/\/www.scrapingdog.com\/blog\/how-to-use-selenium-in-nodejs\/\" target=\"_blank\" rel=\"noopener\">how to use Selenium with Node.js<\/a>\u00a0and posted it on\u00a0<a class=\"af mf\" href=\"https:\/\/www.reddit.com\/r\/programming\/s\/Kd5RkrynSr\" target=\"_blank\" rel=\"nofollow noopener\">Reddit<\/a>. I got zero upvotes and a few comments on it. Most\u00a0<span style=\"box-sizing: border-box; margin: 0px; padding: 0px;\">comments suggested using\u00a0<a href=\"https:\/\/playwright.dev\/\" target=\"_blank\" rel=\"noopener\">Playwright<\/a>\u00a0instead of\u00a0<a href=\"https:\/\/www.selenium.dev\/\" target=\"_blank\" rel=\"noopener\">Selenium<\/a> for web scraping, so I am doing the same via this read.<\/span><\/p><p data-selectable-paragraph=\"\">We will learn <em>how to use Playwright for scraping<\/em>, <em>how to wait until an element appears<\/em>, and more. 
I will explain the step-by-step process for almost every feature Playwright offers for web scraping.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ca93929 elementor-widget elementor-widget-heading\" data-id=\"ca93929\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Setup<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-969d71f font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"969d71f\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"593b\" class=\"pw-post-body-paragraph lh li fq lj b lk ne lm ln lo nf lq lr ls ng lu lv lw nh ly lz ma ni mc md me fj bk\" data-selectable-paragraph=\"\">You need Node.js installed on your machine. If it is not, you can download it from\u00a0<a class=\"af mf\" href=\"https:\/\/nodejs.org\/en\/download\" target=\"_blank\" rel=\"nofollow noopener\">here<\/a>.<\/p><p id=\"6460\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">After that, create a folder and initialize a\u00a0<code class=\"cx nj nk nl nm b\">package.json<\/code>\u00a0file in it.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-51f1d5f elementor-widget elementor-widget-code-highlight\" data-id=\"51f1d5f\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" 
class=\"language-javascript\">\n\t\t\t\t\t<xmp>mkdir play\r\ncd play\r\nnpm init<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-87499f9 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"87499f9\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"e837\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Then install Playwright and Cheerio.\u00a0<a class=\"af mf\" href=\"https:\/\/www.npmjs.com\/package\/cheerio\" target=\"_blank\" rel=\"noopener ugc nofollow\">Cheerio<\/a>\u00a0will be used for parsing raw HTML.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0bb5e0f elementor-widget elementor-widget-code-highlight\" data-id=\"0bb5e0f\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>npm install playwright cheerio<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8194ada font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"8194ada\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"e837\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Once Playwright is installed you have to 
install a browser as well.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0e4cdae elementor-widget elementor-widget-code-highlight\" data-id=\"0e4cdae\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>npx playwright install<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4472268 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"4472268\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"e837\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">The installation part is done. 
Let\u2019s test the setup now.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0fd8a4a elementor-widget elementor-widget-heading\" data-id=\"0fd8a4a\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">How to run Playwright with Nodejs<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9dd6e6a elementor-widget elementor-widget-code-highlight\" data-id=\"9dd6e6a\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>const { chromium } = require('playwright');\r\n\r\nasync function playwrightTest() {\r\n  const browser = await chromium.launch({ headless: false });\r\n  const context = await browser.newContext();\r\n  const page = await context.newPage();\r\n\r\n  await page.goto('https:\/\/www.scrapingdog.com');\r\n  console.log(await page.title());\r\n\r\n  await browser.close();\r\n}\r\n\r\nplaywrightTest()<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ee61af1 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"ee61af1\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"19b3\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">This code first imports the\u00a0<code class=\"cx nj nk nl nm 
b\">chromium<\/code>\u00a0browser object from the playwright library. Then we launch the browser using\u00a0<code class=\"cx nj nk nl nm b\">launch()<\/code>\u00a0method. Then we navigate to\u00a0<a class=\"af mf\" href=\"https:\/\/www.scrapingdog.com\/\" target=\"_blank\" rel=\"noopener\">www.scrapingdog.com<\/a>\u00a0using\u00a0<code class=\"cx nj nk nl nm b\">goto()<\/code>\u00a0method.<\/p><p id=\"0294\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">The web page&#8217;s title is fetched using\u00a0<code class=\"cx nj nk nl nm b\">page.title()<\/code>\u00a0and logged to the console. The browser is closed to clean up resources.<\/p><p id=\"5ef8\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Once you run the code you will get this on your console.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:501\/1*S1EcUhWV8baOhYkkk2epRA.png\" \/><\/p><p data-selectable-paragraph=\"\">This completes the testing of our setup.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4062dff elementor-widget elementor-widget-heading\" data-id=\"4062dff\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">How to scrape with Playwright and Nodejs<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f93f17f font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"f93f17f\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"4d91\" class=\"pw-post-body-paragraph lh 
li fq lj b lk ne lm ln lo nf lq lr ls ng lu lv lw nh ly lz ma ni mc md me fj bk\" data-selectable-paragraph=\"\">In this section, we are going to scrape a\u00a0<a class=\"af mf\" href=\"https:\/\/www.imdb.com\/chart\/moviemeter\/\" target=\"_blank\" rel=\"nofollow noopener\">page<\/a>\u00a0from the IMDB.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0ae5490 elementor-widget elementor-widget-code-highlight\" data-id=\"0ae5490\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>const { chromium } = require('playwright');\r\n\r\nasync function playwrightTest() {\r\n  const browser = await chromium.launch({\r\n    headless: false, \/\/ Set to true in production\r\n    args: [\r\n      '--disable-blink-features=AutomationControlled',\r\n      \/\/ '--use-subprocess' \/\/ Uncomment if needed\r\n    ]\r\n  });\r\n\r\n  const context = await browser.newContext();\r\n  const page = await context.newPage();\r\n\r\n  await page.goto('https:\/\/www.imdb.com\/chart\/moviemeter\/');\r\n  console.log(await page.content());\r\n\r\n  await browser.close();\r\n}\r\n\r\nplaywrightTest()<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bbdf53c font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"bbdf53c\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"4d91\" class=\"pw-post-body-paragraph lh li fq lj b lk ne lm ln lo nf lq lr ls ng lu lv lw nh ly lz ma ni mc md me fj bk\" 
data-selectable-paragraph=\"\">With the\u00a0<code class=\"cx nj nk nl nm b\">page.content()<\/code>\u00a0method we are extracting the raw HTML from our target webpage. Once you run the code you will see this on your console.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:714\/1*FSBHhkO5-qpmw1AgM5N4IA.png\" \/><\/p><p data-selectable-paragraph=\"\">You must be thinking that this data is just garbage. You are right: we have to parse the data out of this raw HTML, which can be done with a parsing library like Cheerio.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:679\/1*WnFYxQpRz0G3Mg9keT18BA.png\" \/><\/p><p data-selectable-paragraph=\"\">We are going to parse the name of the movie and the rating. Let\u2019s find out the DOM location of each element.<\/p><p data-selectable-paragraph=\"\">The data for each movie is stored inside a\u00a0<code class=\"cx nj nk nl nm b\">li<\/code>\u00a0tag with the class\u00a0<code class=\"cx nj nk nl nm b\">ipc-metadata-list-summary-item<\/code>.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:875\/1*H3pdYIthewP6bAJ_AuFM9g.png\" \/><\/p><p data-selectable-paragraph=\"\">If you go inside this\u00a0<code class=\"cx nj nk nl nm b\">li<\/code>\u00a0tag you will see that the title of the movie is located inside a\u00a0<code class=\"cx nj nk nl nm b\">h3<\/code>\u00a0tag with the class\u00a0<code class=\"cx nj nk nl nm b\">ipc-title__text<\/code>.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:875\/1*xjdbLpuHjWHn7DSQ-UD02Q.png\" \/><\/p><p data-selectable-paragraph=\"\">The rating is located inside the\u00a0<code class=\"cx nj nk nl nm b\">span<\/code>\u00a0tag with the class\u00a0<code 
class=\"cx nj nk nl nm b\">ipc-rating-star--rating<\/code>.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-612295e elementor-widget elementor-widget-code-highlight\" data-id=\"612295e\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>const { chromium } = require('playwright');\r\nconst cheerio = require('cheerio')\r\n\r\n\r\nasync function playwrightTest() {\r\n  let obj={}\r\n  let arr=[]\r\n  const browser = await chromium.launch({\r\n    headless: false,    \r\n  });\r\n\r\n  const context = await browser.newContext();\r\n  const page = await context.newPage();\r\n\r\n  await page.goto('https:\/\/www.imdb.com\/chart\/moviemeter\/');\r\n  let html = await page.content()\r\n  const $ = cheerio.load(html);\r\n\r\n  $('li.ipc-metadata-list-summary-item').each((i,el) => {\r\n    obj['Title']= $(el).find('h3.ipc-title__text').text().trim()\r\n    obj['Rating']=$(el).find('span.ipc-rating-star').text().trim()\r\n    arr.push(obj)\r\n    obj={}\r\n  })\r\n  console.log(arr)\r\n  await browser.close();\r\n}\r\n\r\nplaywrightTest()<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e8d1521 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"e8d1521\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"dc3a\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Using the\u00a0<code class=\"cx nj nk nl nm b\">each()<\/code>\u00a0function we are 
iterating over all the\u00a0<code class=\"cx nj nk nl nm b\">li<\/code>\u00a0tags and extracting the Title and Rating of each movie.<\/p><p id=\"0254\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Before closing the browser we are going to print the output.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:708\/1*d0BVp2gdKuEO_RlzDGf3Mg.png\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f0e017c elementor-widget elementor-widget-heading\" data-id=\"f0e017c\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">How to wait for an element in Playwright<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2c4aefd font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"2c4aefd\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"dc3a\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Sometimes while scraping a website you might have to wait for certain elements to appear before scraping begins. 
In this case, you have to use\u00a0<code class=\"cx nj nk nl nm b\">page.waitForSelector()<\/code>\u00a0function for waiting for an element.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e891be3 elementor-widget elementor-widget-code-highlight\" data-id=\"e891be3\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>const { chromium } = require('playwright');\r\nconst cheerio = require('cheerio')\r\n\r\n\r\nasync function playwrightTest() {\r\n \r\n  const browser = await chromium.launch({\r\n    headless: false\r\n  });\r\n\r\n  const context = await browser.newContext();\r\n  const page = await context.newPage();\r\n\r\n  await page.goto('https:\/\/www.imdb.com\/chart\/moviemeter\/');\r\n  await page.waitForSelector('h1.ipc-title__text')\r\n  await browser.close();\r\n}\r\n\r\nplaywrightTest()<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-240d322 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"240d322\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"dc3a\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Here we are waiting for the title to appear before we close the browser.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-93a80ba elementor-widget elementor-widget-heading\" data-id=\"93a80ba\" data-element_type=\"widget\" 
data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">How to do Infinite Scrolling with Playwright<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2f15646 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"2f15646\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"dc3a\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Many e-commerce websites use infinite scrolling, so you might have to keep scrolling down in order to load and scrape the whole page.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3525d71 elementor-widget elementor-widget-code-highlight\" data-id=\"3525d71\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>const { chromium } = require('playwright');\r\nconst cheerio = require('cheerio')\r\n\r\n\r\nasync function playwrightTest() {\r\n\r\n  const browser = await chromium.launch({\r\n    headless: false\r\n  });\r\n\r\n  const context = await browser.newContext();\r\n  const page = await context.newPage();\r\n\r\n  await page.goto('https:\/\/www.imdb.com\/chart\/moviemeter\/');\r\n  let previousHeight;\r\n  while (true) {\r\n  previousHeight = await page.evaluate('document.body.scrollHeight');\r\n  await page.evaluate('window.scrollTo(0, document.body.scrollHeight)');\r\n  await page.waitForTimeout(2000); \/\/ Wait for new content to 
load\r\n\r\n  const newHeight = await page.evaluate('document.body.scrollHeight');\r\n  if (newHeight === previousHeight) break;\r\n}\r\n  await browser.close();\r\n}\r\n\r\nplaywrightTest()<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cceb458 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"cceb458\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"08c7\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">We are using\u00a0<code class=\"cx nj nk nl nm b\">while(true)<\/code>\u00a0to keep scrolling until no new content loads.\u00a0<code class=\"cx nj nk nl nm b\">await page.evaluate('window.scrollTo(0, document.body.scrollHeight)')<\/code>\u00a0scrolls the page to the bottom by setting the vertical scroll position (<code class=\"cx nj nk nl nm b\">window.scrollTo<\/code>) to the maximum scrollable height. 
Once\u00a0<code class=\"cx nj nk nl nm b\">newHeight<\/code>\u00a0and\u00a0<code class=\"cx nj nk nl nm b\">previousHeight<\/code>\u00a0become equal, we break out of the loop.<\/p><p id=\"7613\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Let\u2019s see this in action.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:875\/1*4-ED3yVebUGrCAHA1oxDBw.gif\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-072e86f elementor-widget elementor-widget-heading\" data-id=\"072e86f\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">How to type and click<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8a671b6 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"8a671b6\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"08c7\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">In this example, we are going to simply visit\u00a0<a class=\"af mf\" href=\"http:\/\/www.google.com\/\" target=\"_blank\" rel=\"nofollow noopener\">www.google.com<\/a>,\u00a0enter a query, and press the Enter key. 
After that, we are going to scrape the results using\u00a0<code class=\"cx nj nk nl nm b\">page.content()<\/code>\u00a0method.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4acbc8c elementor-widget elementor-widget-code-highlight\" data-id=\"4acbc8c\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>const { chromium } = require('playwright');\r\nconst cheerio = require('cheerio')\r\n\r\n\r\nasync function playwrightTest() {\r\n\r\n  const browser = await chromium.launch({\r\n    headless: false\r\n    \r\n  });\r\n  const context = await browser.newContext();\r\n  const page = await context.newPage();\r\n\r\n  await page.goto('https:\/\/www.google.com');\r\n\r\n  await page.fill('textarea[name=\"q\"]', 'Scrapingdog');\r\n  await page.press('textarea[name=\"q\"]', 'Enter');\r\n  await page.waitForTimeout(3000);\r\n  console.log(await page.content())\r\n\r\n  await browser.close();\r\n}\r\n\r\nplaywrightTest()<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-612a5b0 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"612a5b0\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"08c7\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">We are simply visiting google.com then we are typing \u2018<strong class=\"lj fr\">Scrapingdog<\/strong>\u2019 in the search query using\u00a0<code class=\"cx nj 
nk nl nm b\">fill()<\/code>\u00a0method and then using the\u00a0<code class=\"cx nj nk nl nm b\">press()<\/code>\u00a0method we pressed the Enter button.<\/p><p data-selectable-paragraph=\"\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:875\/1*10v67ys2qALcE2AsqzYnlw.gif\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3e8a565 elementor-widget elementor-widget-heading\" data-id=\"3e8a565\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">How to Use Proxies with Playwright<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b1decbb font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"b1decbb\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"08c7\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">If you want to scrape a few hundred pages then old traditional methods are fine but if you want to scrape millions of pages then you have to use proxies in order to bypass IP banning.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0e50696 elementor-widget elementor-widget-code-highlight\" data-id=\"0e50696\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>const browser = await chromium.launch({\r\n    headless: false,   \r\n    
proxy: {\r\n      server: 'http:\/\/IP:PORT',\r\n      username: 'USERNAME',\r\n      password: 'PASSWORD'\r\n  }\r\n  });<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-76f6f19 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"76f6f19\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"08c7\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\"><code class=\"cx nj nk nl nm b\">server<\/code>\u00a0specifies the proxy server\u2019s address in the format:\u00a0<code class=\"cx nj nk nl nm b\">protocol:\/\/IP:PORT<\/code>.\u00a0<code class=\"cx nj nk nl nm b\">username<\/code>\u00a0and\u00a0<code class=\"cx nj nk nl nm b\">password<\/code> are the credentials for authenticating with a private proxy. 
If it is public, you might not need a username and password.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-584d562 elementor-widget elementor-widget-heading\" data-id=\"584d562\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Conclusion<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-de3f4ce font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"de3f4ce\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"eq er es et eu l\"><article><div class=\"l\"><div class=\"l\"><section><div><div class=\"fj fk fl fm fn\"><div class=\"ab cb\"><div class=\"ci bh ev ew ex ey\"><p id=\"6725\" class=\"pw-post-body-paragraph lh li fq lj b lk ne lm ln lo nf lq lr ls ng lu lv lw nh ly lz ma ni mc md me fj bk\" data-selectable-paragraph=\"\">Playwright, with its robust and versatile API, is a powerful tool for automating browser interactions and web scraping in Node.js. Whether you\u2019re scraping data, waiting for elements, scrolling, or interacting with complex web elements like buttons and input fields, Playwright simplifies these tasks with its intuitive methods. Moreover, its support for proxies and built-in features like screenshot capturing and multi-browser support make it a reliable choice for developers.<\/p><p id=\"099a\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">I understand that these tasks can be time-consuming, and sometimes, it\u2019s better to focus solely on data collection while leaving the heavy lifting to web scraping APIs like Scrapingdog. 
With Scrapingdog, you don\u2019t have to worry about managing proxies, browsers, or retries \u2014 it takes care of everything for you. With just a simple GET request, you can scrape any page effortlessly using this API.<\/p><p id=\"1074\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">If you found this article helpful, please consider sharing it with your friends and followers on social media!<\/p><\/div><\/div><\/div><\/div><\/section><\/div><\/div><\/article><\/div>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-99975d9 elementor-widget elementor-widget-heading\" data-id=\"99975d9\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Additional Resources<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-69cd5b1 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"69cd5b1\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/web-scraping-with-nodejs\/\" target=\"_blank\" rel=\"noopener\">Web Scraping with Javascript &amp; Nodejs<\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/javascript-web-crawler-nodejs\/\" target=\"_blank\" rel=\"noopener\">How To Do Web Crawling using Javascript in Nodejs<\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/puppeteer-web-scraping\/\" target=\"_blank\" rel=\"noopener\">Web Scraping using Puppeteer &amp; Nodejs<\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/nodejs-axios-proxy\/\" target=\"_blank\" rel=\"noopener\">How To Use a Proxy with Axios &amp; Nodejs<\/a><\/li><li><a 
href=\"https:\/\/www.scrapingdog.com\/blog\/scrape-google-jobs-using-nodejs\/\" target=\"_blank\" rel=\"noopener\">Scraping Google Jobs data using Nodejs<\/a><\/li><\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>In this blog, we will use Playwright with Nodejs for scraping and other automation tasks.<\/p>\n","protected":false},"author":5,"featured_media":20973,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[25,89],"tags":[],"class_list":["post-20967","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog","category-nodejs"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/posts\/20967","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/comments?post=20967"}],"version-history":[{"count":0,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/posts\/20967\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/media\/20973"}],"wp:attachment":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/media?parent=20967"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/categories?post=20967"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/tags?post=20967"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}