{"id":5524,"date":"2025-11-24T02:20:46","date_gmt":"2025-11-24T02:20:46","guid":{"rendered":"https:\/\/scrapingdog.com\/?p=5524"},"modified":"2025-11-25T09:05:38","modified_gmt":"2025-11-25T09:05:38","slug":"scrape-amazon","status":"publish","type":"post","link":"https:\/\/www.scrapingdog.com\/blog\/scrape-amazon\/","title":{"rendered":"Scrape Amazon Using Python (Updated)"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"5524\" class=\"elementor elementor-5524\" data-elementor-post-type=\"post\">\n\t\t\t\t<div class=\"elementor-element elementor-element-4f38987 e-con-full e-flex e-con e-parent\" data-id=\"4f38987\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-3633b9d elementor-widget elementor-widget-html\" data-id=\"3633b9d\" data-element_type=\"widget\" data-widget_type=\"html.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<!-- Gutenberg \u201cCustom HTML\u201d block -->\n<div style=\"\n  background:#d9f4e5;\n  border-left:4px solid #1d9b6c;\n  padding:18px 24px;\n  margin:24px 0;\n  border-radius:6px;\n  font-family:'Montserrat',sans-serif;\n  font-size:18px;\n  line-height:1.65;\n  color:#1a1a1a;\">\n  <p style=\"margin:0 0 8px 0;font-weight:600;\">TL;DR<\/p>\n\n  <ul style=\"margin:0; padding-left:20px;\">\n    <li>Walks you through how to scrape product pages on Amazon using Python with <code>requests<\/code> + <code>BeautifulSoup<\/code> (for title, images, price, rating, specs).<\/li>\n    <li>Shows how to mimic browser-like headers to bypass Amazon\u2019s anti-bot mechanisms.<\/li>\n    <li>Details how to extract high-resolution images via regex search for <code>hiRes<\/code> in the page\u2019s &lt;script&gt; content.<\/li>\n    <li>Provides a full example script with rotating user-agents for basic scraping.<\/li>\n    <li>Explains when you need to scale: using a proxy\/API solution (specifically Scrapingdog\u2019s Amazon Scraper API) to avoid IP 
blocks and handle high volume.<\/li>\n    <li>Covers how to call that API (by ASIN, domain, postal-code-based locale) and other related endpoints (offers, autocomplete) for richer Amazon data.<\/li>\n    \n  <\/ul>\n<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-257f3c1 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"257f3c1\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The e-commerce industry has grown in recent years, transforming from a mere convenience to an essential facet of our daily lives.<\/p><p>As digital storefronts multiply and consumers increasingly turn to online shopping, there\u2019s an increasing demand for data that can drive decision-making, competitive strategies, and <span class=\"font-600\">customer engagement<\/span> in the digital marketplace.<\/p><p>Additionally, scraped Amazon product data can significantly enhance\u00a0<a href=\"https:\/\/synthflow.ai\/blog\/customer-service-automation\" target=\"_blank\" rel=\"noopener\">customer service automation<\/a>\u00a0by providing customer service teams with real-time product information, pricing details, and availability status, enabling them to respond more efficiently to customer inquiries and resolve issues faster.<\/p><p>If you are into an e-commerce niche, scraping Amazon can give you a lot of data points to understand the market.<\/p><p>In this guide, we will <span style=\"box-sizing: border-box; margin: 0px; padding: 0px;\">use Python to scrape Amazon, do\u00a0<a href=\"https:\/\/www.scrapingdog.com\/blog\/scrape-prices\/\" target=\"_blank\" rel=\"noopener\">price scraping<\/a> from this platform,<\/span> and demonstrate how to extract crucial information to help you make well-informed decisions in your business.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element 
elementor-element-e45f045 e-con-full e-flex e-con e-child\" data-id=\"e45f045\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-8c2e349 elementor-widget elementor-widget-heading\" data-id=\"8c2e349\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Setting up the prerequisites<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-28e90ba font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"28e90ba\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>I am assuming that you have already installed <code>Python 3.x<\/code> on your machine. If not, you can download it from <a href=\"https:\/\/www.python.org\/downloads\/\" target=\"_blank\" rel=\"nofollow noopener\"><span class=\"font-600\">here.<\/span><\/a> Apart from this, we will need two third-party Python libraries.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d59c079 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"d59c079\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul class=\"wp-block-list\"><li><a href=\"https:\/\/pypi.org\/project\/requests\/\" target=\"_blank\" rel=\"nofollow noopener\"><strong class=\"font-600\">Requests<\/strong><\/a>\u2013 We will use this library to send HTTP requests to the Amazon page. 
This library will help us to extract the raw HTML from the target page.<\/li><li><a href=\"https:\/\/www.crummy.com\/software\/BeautifulSoup\/bs4\/doc\/\" target=\"_blank\" rel=\"nofollow noopener\"><strong class=\"font-600\">BeautifulSoup<\/strong><\/a>\u2013 This is a powerful data parsing library. Using this we will extract necessary data out of the raw HTML we get using the requests library.<\/li><\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a3e3fcd font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"a3e3fcd\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Before we install these libraries we will have to create a dedicated folder for our project.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-951afec elementor-widget elementor-widget-code-highlight\" data-id=\"951afec\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>mkdir amazonscraper<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7f17123 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"7f17123\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Now, we will have to install the above two libraries in this folder. 
Here is how you can do it.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c7c6c4f elementor-widget elementor-widget-code-highlight\" data-id=\"c7c6c4f\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>pip install beautifulsoup4\r\npip install requests<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bc62fd3 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"bc62fd3\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tNow, you can create a Python file by any name you wish. This will be the main file where we will keep our code. 
I am naming it\u00a0<code>amazon.py<\/code>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-9a64944 e-con-full e-flex e-con e-child\" data-id=\"9a64944\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-35d3af5 elementor-widget elementor-widget-heading\" data-id=\"35d3af5\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Downloading raw data from amazon.com<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2da5290 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"2da5290\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tLet\u2019s make a normal GET request to our target page and see what happens. 
For GET request we are going to use the\u00a0<code class=\"font-600\">requests<\/code>\u00a0library.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6bff6ec elementor-widget elementor-widget-code-highlight\" data-id=\"6bff6ec\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>import requests\r\nfrom bs4 import BeautifulSoup\r\n\r\ntarget_url=\"https:\/\/www.amazon.com\/dp\/B0BSHF7WHW\"\r\n\r\nresp = requests.get(target_url)\r\n\r\nprint(resp.text)<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-41bbf20 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"41bbf20\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Once you run this code, you might see this.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4756f28 elementor-widget elementor-widget-image\" data-id=\"4756f28\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"800\" height=\"58\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-4-1.jpg\" class=\"attachment-large size-large wp-image-7052\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-4-1.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-4-1-300x22.jpg 300w, 
https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-4-1-768x56.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8d57e3d font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"8d57e3d\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tThis is a captcha from amazon.com and this happens once their architecture observes that the incoming request is from a\u00a0<em>bot\/script<\/em>\u00a0and not from a\u00a0<strong><em>real human being<\/em><\/strong>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f103334 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"f103334\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>To bypass this on-site protection of\u00a0<span class=\"font-600\">Amazon<\/span>\u00a0we can send some headers like <a href=\"https:\/\/www.scrapingdog.com\/blog\/user-agent-in-web-scraping\/\" target=\"_blank\" rel=\"noopener\">User-Agent<\/a>. You can even check what headers are sent to amazon.com once you open the URL in your browser. 
You can find them in the\u00a0<strong>Network<\/strong>\u00a0tab of your browser\u2019s developer tools.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e6e4507 elementor-widget elementor-widget-image\" data-id=\"e6e4507\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"699\" height=\"608\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-5-1.jpg\" class=\"attachment-large size-large wp-image-7076\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-5-1.jpg 699w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-5-1-300x261.jpg 300w\" sizes=\"(max-width: 699px) 100vw, 699px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-60fe1e1 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"60fe1e1\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Once you pass these headers with the request, it will look like a request coming from a real browser. This is often enough to get past the anti-bot wall of\u00a0amazon.com. 
Let\u2019s pass a few headers to our request.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0455c97 elementor-widget elementor-widget-code-highlight\" data-id=\"0455c97\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>import requests\r\nfrom bs4 import BeautifulSoup\r\n\r\ntarget_url=\"https:\/\/www.amazon.com\/dp\/B0BSHF7WHW\"\r\n\r\nheaders={\"accept-language\": \"en-US,en;q=0.9\",\"accept-encoding\": \"gzip, deflate, br\",\"User-Agent\":\"Mozilla\/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/111.0.0.0 Safari\/537.36\",\"accept\": \"text\/html,application\/xhtml+xml,application\/xml;q=0.9,image\/avif,image\/webp,image\/apng,*\/*;q=0.8,application\/signed-exchange;v=b3;q=0.7\"}\r\n\r\nresp = requests.get(target_url, headers=headers)\r\n\r\nprint(resp.text)<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6067189 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"6067189\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Once you run this code you might be able to bypass the anti-scraping protection wall of Amazon.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7fe3e7e font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"7fe3e7e\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Now 
let\u2019s decide exactly what information we want to scrape from the page.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-5f1c234 e-con-full e-flex e-con e-child\" data-id=\"5f1c234\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-c1550c3 elementor-widget elementor-widget-heading\" data-id=\"c1550c3\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What are we going to scrape from Amazon?<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-923f201 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"923f201\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>It is always a good idea to decide in advance what you are going to extract from the\u00a0<a href=\"https:\/\/www.amazon.com\/dp\/B0BSHF7WHW\" target=\"_blank\" rel=\"nofollow noopener\"><span class=\"font-600\">target page<\/span><\/a>. 
This way we can analyze in advance which element is placed where inside the DOM.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d69dd86 elementor-widget elementor-widget-image\" data-id=\"d69dd86\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"800\" height=\"471\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-6-1.jpg\" class=\"attachment-large size-large wp-image-7086\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-6-1.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-6-1-300x176.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-6-1-768x452.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-85526e6 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"85526e6\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<figure style=\"margin: 1em 40px; color: #292929; font-family: charter, Georgia, Cambria, 'Times New Roman', Times, serif; letter-spacing: -0.06px;\"><figcaption style=\"box-sizing: inherit; color: rgba(0, 0, 0, 0.54); font-family: medium-content-sans-serif-font, 'Lucida Grande', 'Lucida Sans Unicode', 'Lucida Sans', Geneva, Arial, sans-serif; line-height: 20px; margin: 0.5em auto 1em; max-width: 728px; text-align: center;\">Product details we are going to scrape from Amazon<\/figcaption><\/figure>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8e79d5a font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"8e79d5a\" data-element_type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>We are going to scrape five data elements from the page.<\/p><ul class=\"wp-block-list\"><li><b><em>Name of the product<\/em><\/b><\/li><li><strong><em>Images<\/em><\/strong><\/li><li><b><em>Price (most important)<\/em><\/b><\/li><li><b><em>Rating<\/em><\/b><\/li><li><em><b>Specs<\/b><\/em><\/li><\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6a8fe6c font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"6a8fe6c\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"942a\">First, we are going to make the GET request to the target page using the\u00a0<code class=\"font-600\">requests<\/code>\u00a0library, and then we are going to parse out this data using BS4. Of course, there are other libraries like\u00a0<code class=\"font-600\">lxml<\/code> that can be used in place of BS4, but BS4 has a simple, easy-to-use API.<\/p><p id=\"a789\">Before making the request we are going to analyze the page and find the location of each element inside the DOM. One should always do this exercise before writing any parsing code.<\/p><p id=\"8b03\">We are going to do this by simply using the browser\u2019s developer tools. These can be opened by right-clicking on the target element and then clicking <strong class=\"font-600\">Inspect<\/strong>. 
This is the most common method, you might already know this.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-a8e295c e-con-full e-flex e-con e-child\" data-id=\"a8e295c\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-786b7fe elementor-widget elementor-widget-heading\" data-id=\"786b7fe\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Identifying the location of each element<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-54bd17e elementor-widget elementor-widget-heading\" data-id=\"54bd17e\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Location of the title tag<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8c30875 elementor-widget elementor-widget-image\" data-id=\"8c30875\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"158\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-7-1.jpg\" class=\"attachment-large size-large wp-image-7101\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-7-1.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-7-1-300x59.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-7-1-768x152.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d689bc0 
font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"d689bc0\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<figure style=\"margin: 1em 40px; color: #292929; font-family: charter, Georgia, Cambria, 'Times New Roman', Times, serif; letter-spacing: -0.06px;\"><figcaption style=\"box-sizing: inherit; color: rgba(0, 0, 0, 0.54); font-family: medium-content-sans-serif-font, 'Lucida Grande', 'Lucida Sans Unicode', 'Lucida Sans', Geneva, Arial, sans-serif; line-height: 20px; margin: 0.5em auto 1em; max-width: 728px; text-align: center;\">Identifying location of title tag in source code of amazon website<\/figcaption><\/figure>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e2e1957 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"e2e1957\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"559e\">Once you inspect the\u00a0<code>title<\/code>\u00a0you will find that the title text is located inside the\u00a0<code><strong class=\"font-600\">h1 tag<\/strong><\/code>\u00a0with the\u00a0<code><strong class=\"font-600\">id title<\/strong><\/code>.<\/p>\n<p id=\"9c50\">Coming back to our\u00a0<code><strong class=\"font-600\">amazon.py<\/strong><\/code>\u00a0file, we will write the code to extract this information from Amazon.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-90ee34a elementor-widget elementor-widget-code-highlight\" data-id=\"90ee34a\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript 
line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>import requests\r\nfrom bs4 import BeautifulSoup\r\n\r\nl=[]  # list of scraped products\r\no={}  # data for the current product\r\n\r\nurl=\"https:\/\/www.amazon.com\/dp\/B0BSHF7WHW\"\r\n\r\nheaders={\"accept-language\": \"en-US,en;q=0.9\",\"accept-encoding\": \"gzip, deflate, br\",\"User-Agent\":\"Mozilla\/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/111.0.0.0 Safari\/537.36\",\"accept\": \"text\/html,application\/xhtml+xml,application\/xml;q=0.9,image\/avif,image\/webp,image\/apng,*\/*;q=0.8,application\/signed-exchange;v=b3;q=0.7\"}\r\n\r\nresp = requests.get(url, headers=headers)\r\nprint(resp.status_code)\r\n\r\nsoup=BeautifulSoup(resp.text,'html.parser')\r\n\r\n# find() returns None when the tag is missing, so .text raises AttributeError\r\ntry:\r\n    o[\"title\"]=soup.find('h1',{'id':'title'}).text.strip()\r\nexcept AttributeError:\r\n    o[\"title\"]=None\r\n\r\nl.append(o)\r\nprint(l)<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-de5d95a font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"de5d95a\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"5ec7\">Here the line <code class=\"font-600\">soup=BeautifulSoup(resp.text,'html.parser')<\/code>\u00a0uses the BeautifulSoup library to create a BeautifulSoup object from the HTTP response text, with the specified HTML parser.<\/p>\n<p id=\"da65\">Then the\u00a0<code><strong>soup.find()<\/strong><\/code>\u00a0method returns the first occurrence of the\u00a0<code><strong>h1 tag<\/strong><\/code>\u00a0with the\u00a0<code><strong>id title<\/strong><\/code>. We are using the\u00a0<code><strong>.text<\/strong><\/code>\u00a0attribute to get the text from that element. 
Finally, the\u00a0<code><strong>.strip()<\/strong><\/code>\u00a0method removes the leading and trailing whitespace from the text we receive.<\/p>\n<p id=\"433a\">Once you run this code you will get output like this.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-db710c8 elementor-widget elementor-widget-code-highlight\" data-id=\"db710c8\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>[{'title': 'Apple 2023 MacBook Pro Laptop M2 Pro chip with 12\u2011core CPU and 19\u2011core GPU: 16.2-inch Liquid Retina XDR Display, 16GB Unified Memory, 1TB SSD Storage. Works with iPhone\/iPad; Space Gray'}]<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4cc9eba font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"4cc9eba\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>If you skipped the earlier section on downloading the HTML from the target page, the code above may be hard to follow. 
So, please read the above section before moving ahead.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-dbd2761 elementor-widget elementor-widget-heading\" data-id=\"dbd2761\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Location of the image tag<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-160f5cf font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"160f5cf\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This might be the most tricky part of this complete tutorial. Let\u2019s inspect and find out why it is a little tricky.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1c32dca elementor-widget elementor-widget-image\" data-id=\"1c32dca\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"301\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-8.jpg\" class=\"attachment-large size-large wp-image-7120\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-8.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-8-300x113.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-8-768x289.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8368344 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"8368344\" data-element_type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<figure class=\"wp-block-image size-full\"><figcaption class=\"wp-element-caption\">Inspecting the image tag in the source code of the Amazon website<\/figcaption><\/figure>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a8981c5 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"a8981c5\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tAs you can see, the\u00a0<code><strong class=\"font-600\">img tag<\/strong><\/code>\u00a0that holds the image sits inside a\u00a0<code><strong class=\"font-600\">div tag<\/strong><\/code>\u00a0with the class\u00a0<code><strong class=\"font-600\">imgTagWrapper<\/strong><\/code>.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c09042e elementor-widget elementor-widget-code-highlight\" data-id=\"c09042e\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>allimages = soup.find_all(\"div\",{\"class\":\"imgTagWrapper\"})\r\nprint(len(allimages))<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ef9678e font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"ef9678e\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"4763\">Running this prints 3. 
Now, there are 6 images on the page and we are getting just 3. The reason behind this is JS rendering. Amazon loads its images through an AJAX request in the background. That\u2019s why we never receive these images when we make an HTTP connection to the page through the\u00a0<code><strong class=\"font-600\">requests<\/strong><\/code>\u00a0library, which does not execute JavaScript.<\/p>\n<p id=\"4243\">Finding high-resolution images is not as simple as finding the title tag, but I will explain step by step how you can find all the images of the product.<\/p>\n\n<ol class=\"wp-block-list\">\n \t<li>Copy any product image URL from the page.<\/li>\n \t<li>Then click on View Page Source to open the source of the target webpage.<\/li>\n \t<li>Then search the source for this image URL.<\/li>\n<\/ol>\n<p id=\"b2ce\">You will find that all the images are stored as values of the\u00a0<code><strong class=\"font-600\">hiRes<\/strong><\/code>\u00a0key.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-558669f elementor-widget elementor-widget-image\" data-id=\"558669f\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"207\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-9.jpg\" class=\"attachment-large size-large wp-image-7128\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-9.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-9-300x78.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-9-768x198.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-03077d2 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"03077d2\" data-element_type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"280d\">All this information is stored inside a\u00a0<code>script tag<\/code>. Now, here we will use\u00a0<a href=\"https:\/\/www.scrapingdog.com\/blog\/web-scraping-with-python\/#Regular_Expression\" target=\"_blank\" rel=\"noopener\"><span class=\"font-600\">regular expressions<\/span><\/a> to find every occurrence of the pattern <code><strong>\"hiRes\":\"image_url\"<\/strong><\/code>.<\/p><p id=\"672b\">We could still use BS4, but that would make the process longer and might slow down our scraper. Instead, we will use the\u00a0<code><strong>(.+?)<\/strong><\/code> group, a <a href=\"https:\/\/www.computerworld.com\/article\/2786107\/regular-expression-tutorial-part-5--greedy-and-non-greedy-quantification.html\" target=\"_blank\" rel=\"nofollow noopener\"><span class=\"font-600\">non-greedy<\/span><\/a>\u00a0match for one or more characters. Let me explain what each character in this expression means.<\/p><ul class=\"wp-block-list\"><li>The\u00a0<code>.<\/code>\u00a0matches any character except a newline.<\/li><li>The\u00a0<code>+<\/code>\u00a0matches one or more occurrences of the preceding element.<\/li><li>The\u00a0<code>?<\/code>\u00a0makes the match non-greedy, meaning that it will match the minimum number of characters needed to satisfy the pattern.<\/li><li>The parentheses\u00a0<code>( )<\/code>\u00a0form a capturing group, so only the text between the quotes is returned.<\/li><\/ul><p id=\"d7df\">The regular expression will return every captured sequence of characters (here, each image URL) from the HTML string we pass to it.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-14bfa27 elementor-widget elementor-widget-code-highlight\" data-id=\"14bfa27\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height 
language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp># Capture every URL stored under a \"hiRes\" key in the page source\r\nimages = re.findall('\"hiRes\":\"(.+?)\"', resp.text)\r\no[\"images\"]=images<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ae5efcd font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"ae5efcd\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This will return the URLs of all the high-resolution images of the product in a list. In general, regular expressions are not recommended for parsing HTML, but for pulling a simple pattern like this out of the page source they can do wonders.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3466e3a elementor-widget elementor-widget-image\" data-id=\"3466e3a\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"159\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-10.jpg\" class=\"attachment-large size-large wp-image-7146\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-10.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-10-300x60.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-10-768x153.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-88cee5d elementor-widget elementor-widget-heading\" data-id=\"88cee5d\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">\nParsing the 
price tag<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7006547 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"7006547\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>There are two price tags on the page, but we will only extract the one which is just below the rating.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c48ee98 elementor-widget elementor-widget-image\" data-id=\"c48ee98\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"244\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-11.jpg\" class=\"attachment-large size-large wp-image-7147\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-11.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-11-300x92.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-11-768x235.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-095d733 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"095d733\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tWe can see that the price tag is stored inside\u00a0<strong class=\"font-600\"><code>span tag<\/code><\/strong>\u00a0with class\u00a0<strong class=\"font-600\"><code>a-price<\/code><\/strong>. Once you find this tag you can find the first\u00a0child <strong class=\"font-600\"><code>span tag<\/code><\/strong>\u00a0to get the price. 
Here is how you can do it.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1d3c1c8 elementor-widget elementor-widget-code-highlight\" data-id=\"1d3c1c8\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>try:\r\n    o[\"price\"]=soup.find(\"span\",{\"class\":\"a-price\"}).find(\"span\").text\r\nexcept:\r\n    o[\"price\"]=None<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ecb5c6c font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"ecb5c6c\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"559e\">Once you print\u00a0<code class=\"font-600\">object o<\/code>, you will get to see the price.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6d39107 elementor-widget elementor-widget-code-highlight\" data-id=\"6d39107\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>{'price': '$2,499.00'}<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-dc7ac1e elementor-widget elementor-widget-heading\" data-id=\"dc7ac1e\" data-element_type=\"widget\" 
data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">\nExtract rating<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a1b943a elementor-widget elementor-widget-image\" data-id=\"a1b943a\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"329\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-12.jpg\" class=\"attachment-large size-large wp-image-7155\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-12.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-12-300x124.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-12-768x316.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-524d11e font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"524d11e\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"559e\">You can find the rating in the first\u00a0<code><strong class=\"font-600\">i tag<\/strong><\/code>\u00a0with class\u00a0<code><strong class=\"font-600\">a-icon-star<\/strong><\/code>. 
Let\u2019s see how to scrape this too.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e42cff8 elementor-widget elementor-widget-code-highlight\" data-id=\"e42cff8\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>try:\r\n    o[\"rating\"]=soup.find(\"i\",{\"class\":\"a-icon-star\"}).text\r\nexcept:\r\n    o[\"rating\"]=None<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f97fa7d font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"f97fa7d\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>It will return this.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7e65246 elementor-widget elementor-widget-code-highlight\" data-id=\"7e65246\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>{'rating': '4.1 out of 5 stars'}<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6475f2e font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"6475f2e\" data-element_type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"559e\">In the same manner, we can scrape the specs of the device.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6532f88 elementor-widget elementor-widget-heading\" data-id=\"6532f88\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">\nExtract the specs of the device\n<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-294693f elementor-widget elementor-widget-image\" data-id=\"294693f\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"434\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-13.jpg\" class=\"attachment-large size-large wp-image-7167\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-13.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-13-300x163.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-13-768x416.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-073d825 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"073d825\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"559e\">These specs are stored inside tr tags with class a-spacing-small. Each of these rows holds two span tags: the first carries the name of the spec and the second carries its value. You can see this in the image above. 
Here is how it can be done.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c0fa101 elementor-widget elementor-widget-code-highlight\" data-id=\"c0fa101\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>specs_arr=[]\r\nspecs_obj={}\r\n\r\nspecs = soup.find_all(\"tr\",{\"class\":\"a-spacing-small\"})\r\n\r\nfor row in specs:\r\n    spanTags = row.find_all(\"span\")\r\n    # Skip rows that do not hold both a name span and a value span\r\n    if len(spanTags) < 2:\r\n        continue\r\n    specs_obj[spanTags[0].text]=spanTags[1].text\r\n\r\n\r\nspecs_arr.append(specs_obj)\r\no[\"specs\"]=specs_arr<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ad06ccd font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"ad06ccd\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"1002\">Using\u00a0<code><strong class=\"font-600\">.find_all()<\/strong><\/code>, we find all the\u00a0<code><strong class=\"font-600\">tr tags<\/strong><\/code>\u00a0with class\u00a0<code><strong class=\"font-600\">a-spacing-small<\/strong><\/code>. Then we run a\u00a0<code><strong class=\"font-600\">for loop<\/strong><\/code>\u00a0to iterate over all the\u00a0<code><strong class=\"font-600\">tr tags<\/strong><\/code>. Inside the\u00a0<code><strong class=\"font-600\">for loop<\/strong><\/code>, we find all the\u00a0<code><strong class=\"font-600\">span tags<\/strong><\/code>\u00a0in each row and skip any row that has fewer than two. 
Then finally we are extracting the\u00a0<code><strong class=\"font-600\">text<\/strong><\/code>\u00a0from each\u00a0<code><strong class=\"font-600\">span tag<\/strong><\/code>.<\/p>\n<p id=\"e9ee\">Once you print the\u00a0<code><strong class=\"font-600\">object o<\/strong><\/code>\u00a0it will look like this.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-474cc49 elementor-widget elementor-widget-image\" data-id=\"474cc49\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"277\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-14.jpg\" class=\"attachment-large size-large wp-image-7176\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-14.jpg 828w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-14-300x104.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-14-768x266.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-26c999e font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"26c999e\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Throughout the tutorial, we have used\u00a0<a href=\"https:\/\/www.w3schools.com\/python\/python_try_except.asp\" target=\"_blank\" rel=\"nofollow noopener\">try\/except<\/a>\u00a0statements to avoid any run time error. 
We have now managed to scrape all the data we decided to scrape at the beginning of the tutorial.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-2891e0e e-con-full e-flex e-con e-child\" data-id=\"2891e0e\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-5ba45ac elementor-widget elementor-widget-heading\" data-id=\"5ba45ac\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Complete Code<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-36aa748 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"36aa748\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"2c4c\">You can, of course, make a few changes to the code to extract more data, because the page contains a lot more information. You can even use cron jobs to mail yourself an alert when the price drops. 
Or you can integrate this technique into your app so that it mails your users when the price of any item on Amazon drops.<\/p><p id=\"fd7e\">But for now, the code will look like this.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4a23e14 elementor-widget elementor-widget-code-highlight\" data-id=\"4a23e14\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>import requests\r\nfrom bs4 import BeautifulSoup\r\nimport re\r\n\r\nl=[]\r\no={}\r\nspecs_arr=[]\r\nspecs_obj={}\r\n\r\ntarget_url=\"https:\/\/www.amazon.com\/dp\/B0BSHF7WHW\"\r\n\r\nheaders={\"accept-language\": \"en-US,en;q=0.9\",\"accept-encoding\": \"gzip, deflate, br\",\"User-Agent\":\"Mozilla\/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/111.0.0.0 Safari\/537.36\",\"accept\": \"text\/html,application\/xhtml+xml,application\/xml;q=0.9,image\/avif,image\/webp,image\/apng,*\/*;q=0.8,application\/signed-exchange;v=b3;q=0.7\"}\r\n\r\nresp = requests.get(target_url, headers=headers)\r\nprint(resp.status_code)\r\nif(resp.status_code != 200):\r\n    print(resp)\r\nsoup=BeautifulSoup(resp.text,'html.parser')\r\n\r\n\r\ntry:\r\n    o[\"title\"]=soup.find('h1',{'id':'title'}).text.strip()\r\nexcept:\r\n    o[\"title\"]=None\r\n\r\n\r\nimages = re.findall('\"hiRes\":\"(.+?)\"', resp.text)\r\no[\"images\"]=images\r\n\r\ntry:\r\n    o[\"price\"]=soup.find(\"span\",{\"class\":\"a-price\"}).find(\"span\").text\r\nexcept:\r\n    o[\"price\"]=None\r\n\r\ntry:\r\n    o[\"rating\"]=soup.find(\"i\",{\"class\":\"a-icon-star\"}).text\r\nexcept:\r\n    o[\"rating\"]=None\r\n\r\n\r\nspecs = 
soup.find_all(\"tr\",{\"class\":\"a-spacing-small\"})\r\n\r\nfor row in specs:\r\n    spanTags = row.find_all(\"span\")\r\n    # Skip rows that do not hold both a name span and a value span\r\n    if len(spanTags) < 2:\r\n        continue\r\n    specs_obj[spanTags[0].text]=spanTags[1].text\r\n\r\n\r\nspecs_arr.append(specs_obj)\r\no[\"specs\"]=specs_arr\r\nl.append(o)\r\n\r\n\r\nprint(l)<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-4f0bb4e e-con-full e-flex e-con e-child\" data-id=\"4f0bb4e\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-a404051 elementor-widget elementor-widget-heading\" data-id=\"a404051\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Changing Headers on every request<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e3d6741 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"e3d6741\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>With the above code, your scraper will come to a halt once Amazon recognizes a pattern in your requests.<\/p><p>To delay this, you can keep changing your headers so that consecutive requests look less alike. You can rotate a bunch of headers to overcome this challenge. 
Here is how it can be done.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e3ae578 elementor-widget elementor-widget-code-highlight\" data-id=\"e3ae578\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>import requests\r\nfrom bs4 import BeautifulSoup\r\nimport re\r\nimport random\r\n\r\nl=[]\r\no={}\r\nspecs_arr=[]\r\nspecs_obj={}\r\n\r\nuseragents=['Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_12_2) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.88 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4894.117 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_4) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4855.118 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.88 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4892.86 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_4) 
AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4854.191 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4859.153 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.79 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36\/null',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36,gzip(gfe)',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_4) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4895.86 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 12_3_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_13) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4860.89 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4885.173 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4864.0 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_12) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4877.207 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_12_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 12_2_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.60 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_6) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit\/537.36 (KHTML%2C like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    
'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.133 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_16_0) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.75 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_12) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4872.118 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 12_3_1) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.88 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 11_13) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4876.128 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit\/537.36 (KHTML%2C like Gecko) Chrome\/100.0.4896.127 Safari\/537.36',\r\n    'Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/100.0.4896.127 Safari\/537.36']\r\n\r\ntarget_url=\"https:\/\/www.amazon.com\/dp\/B0BSHF7WHW\"\r\n\r\n# random.randint includes both endpoints, so the upper bound must be the last valid index\r\nheaders={\"User-Agent\":useragents[random.randint(0,len(useragents)-1)],\"accept-language\": \"en-US,en;q=0.9\",\"accept-encoding\": \"gzip, deflate, br\",\"accept\": \"text\/html,application\/xhtml+xml,application\/xml;q=0.9,image\/avif,image\/webp,image\/apng,*\/*;q=0.8,application\/signed-exchange;v=b3;q=0.7\"}\r\n\r\nresp = requests.get(target_url,headers=headers)\r\nprint(resp.status_code)\r\nif(resp.status_code != 200):\r\n    print(resp)\r\nsoup=BeautifulSoup(resp.text,'html.parser')\r\n\r\n\r\ntry:\r\n    o[\"title\"]=soup.find('h1',{'id':'title'}).text.strip()\r\nexcept:\r\n    o[\"title\"]=None\r\n\r\n\r\nimages = re.findall('\"hiRes\":\"(.+?)\"', resp.text)\r\no[\"images\"]=images\r\n\r\ntry:\r\n    o[\"price\"]=soup.find(\"span\",{\"class\":\"a-price\"}).find(\"span\").text\r\nexcept:\r\n    o[\"price\"]=None\r\n\r\ntry:\r\n    o[\"rating\"]=soup.find(\"i\",{\"class\":\"a-icon-star\"}).text\r\nexcept:\r\n    o[\"rating\"]=None\r\n\r\n\r\nspecs = 
soup.find_all(\"tr\",{\"class\":\"a-spacing-small\"})\r\n\r\nfor row in specs:\r\n    spanTags = row.find_all(\"span\")\r\n    # Skip rows that do not hold both a name span and a value span\r\n    if len(spanTags) < 2:\r\n        continue\r\n    specs_obj[spanTags[0].text]=spanTags[1].text\r\n\r\n\r\nspecs_arr.append(specs_obj)\r\no[\"specs\"]=specs_arr\r\nl.append(o)\r\n\r\n\r\nprint(l)<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e3462dc font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"e3462dc\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9580\">We are using the random library here to pick a random index into the useragents list, so each run sends a different User-Agent. Note that random.randint includes both endpoints, so for this 31-item list the largest valid index is 30 (len(useragents) - 1); an upper bound of 31 would occasionally raise an IndexError. Keep the list updated with current browser versions, since outdated user agents are more likely to be flagged by the anti-scraping wall.<\/p><p id=\"2a7d\">But again, this technique is not enough to scrape Amazon at scale. What if you want to scrape millions of such pages? Rotating headers will not help then, because your IP will be blocked. 
So, for mass scraping one has to use a\u00a0<strong class=\"font-600\">web scraping proxy API<\/strong>\u00a0to avoid getting blocked while scraping.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-c0f97de e-con-full e-flex e-con e-child\" data-id=\"c0f97de\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-f351e7c elementor-widget elementor-widget-heading\" data-id=\"f351e7c\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Using Scrapingdog for scraping Amazon<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-55e116c font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"55e116c\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9e4d\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">The advantages of using Scrapingdog\u2019s\u00a0<a class=\"af mf\" href=\"https:\/\/www.scrapingdog.com\/amazon-scraper-api\/\" target=\"_blank\" rel=\"nofollow noopener\">Amazon Scraper API<\/a>\u00a0are:<\/p><ul class=\"\"><li id=\"bced\" class=\"lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me mg mh mi bk\" data-selectable-paragraph=\"\"><strong class=\"lj fr\"><em class=\"mj\">You won\u2019t have to manage headers anymore.<\/em><\/strong><\/li><li id=\"6fc8\" class=\"lh li fq lj b lk mk lm ln lo ml lq lr ls mm lu lv lw mn ly lz ma mo mc md me mg mh mi bk\" data-selectable-paragraph=\"\"><strong class=\"lj fr\"><em class=\"mj\">Every request will go through a new IP. 
This keeps your IP anonymous<\/em><\/strong>.<\/li><li id=\"19bd\" class=\"lh li fq lj b lk mk lm ln lo ml lq lr ls mm lu lv lw mn ly lz ma mo mc md me mg mh mi bk\" data-selectable-paragraph=\"\"><strong class=\"lj fr\"><em class=\"mj\">Our API automatically retries if the first hit fails.<\/em><\/strong><\/li><li id=\"a827\" class=\"lh li fq lj b lk mk lm ln lo ml lq lr ls mm lu lv lw mn ly lz ma mo mc md me mg mh mi bk\" data-selectable-paragraph=\"\"><strong class=\"lj fr\"><em class=\"mj\">Scrapingdog will handle issues like changes in HTML tags. You won\u2019t have to keep checking for tag changes. You can focus on data collection.<\/em><\/strong><\/li><\/ul><p id=\"cf89\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Let me show you how easy it is to scrape Amazon product pages using Scrapingdog with just an ASIN. It is worth reading the\u00a0<a class=\"af mf\" href=\"https:\/\/docs.scrapingdog.com\/amazon-scraper-api\" target=\"_blank\" rel=\"nofollow noopener\">documentation<\/a>\u00a0before trying the API.<\/p><p id=\"f0ec\" class=\"pw-post-body-paragraph lh li fq lj b lk ll lm ln lo lp lq lr ls lt lu lv lw lx ly lz ma mb mc md me fj bk\" data-selectable-paragraph=\"\">Before you try the API, you have to\u00a0<a class=\"af mf\" href=\"https:\/\/api.scrapingdog.com\/register\" target=\"_blank\" rel=\"nofollow noopener\">sign up<\/a> for the free pack. 
The free pack comes with 1,000 credits, which is enough to test the Amazon Scraper API.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e2c8fde elementor-widget elementor-widget-code-highlight\" data-id=\"e2c8fde\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>import requests\r\n\r\nurl = \"https:\/\/api.scrapingdog.com\/amazon\/product\"\r\nparams = {\r\n    \"api_key\": \"Your-API-Key\",\r\n    \"domain\": \"com\",\r\n    \"asin\": \"B0C22KCKVQ\"\r\n}\r\n\r\nresponse = requests.get(url, params=params)\r\n\r\nif response.status_code == 200:\r\n    data = response.json()\r\n    print(data)\r\nelse:\r\n    print(f\"Request failed with status code {response.status_code}\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-966820e font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"966820e\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"d3e4\">Once you run this code, you will get this beautiful JSON response.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a28e3db elementor-widget elementor-widget-image\" data-id=\"a28e3db\" data-element_type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"213\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-19-1024x272.jpg\" 
class=\"attachment-large size-large wp-image-7197\" alt=\"\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-19-1024x272.jpg 1024w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-19-300x80.jpg 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-19-768x204.jpg 768w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2024\/08\/image-19.jpg 1400w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e434c12 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"e434c12\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"d3e4\">This JSON contains almost all the data you see on the Amazon product page.\u00a0<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b653c4c elementor-widget elementor-widget-heading\" data-id=\"b653c4c\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Scraping Amazon data based on Postal Codes<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c7ad216 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"c7ad216\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"d3e4\">Now, let\u2019s scrape the data for a particular postal code. 
For this example, we are going to target New York;\u00a0<code class=\"cx op oq or mv b\">10001<\/code>\u00a0is a postal code in New York City.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0a3bc31 elementor-widget elementor-widget-code-highlight\" data-id=\"0a3bc31\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>import requests\n\napi_key = \"Your-API-Key\"\nurl = \"https:\/\/api.scrapingdog.com\/amazon\/product\"\n\nparams = {\n    \"api_key\": api_key,\n    \"asin\": \"B0CTKXMQXK\",\n    \"domain\": \"com\",\n    \"postal_code\": \"10001\",\n    \"country\": \"us\"\n}\n\nresponse = requests.get(url, params=params)\n\nif response.status_code == 200:\n    data = response.json()\n    print(data)\nelse:\n    print(f\"Request failed with status code: {response.status_code}\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6829e36 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"6829e36\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"d3e4\">Once you run this code, you will get a beautiful JSON response for the New York location.<\/p><p>\u00a0<\/p><p><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/1*qnZVsIMkVhmez-lAI_YxEA.png\" \/><\/p><p>I have also created a video that walks you through scraping Amazon with Scrapingdog.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element 
elementor-element-8c9c6df elementor-widget elementor-widget-video\" data-id=\"8c9c6df\" data-element_type=\"widget\" data-settings=\"{&quot;youtube_url&quot;:&quot;https:\\\/\\\/www.youtube.com\\\/watch?v=mGROpJfPZ6w&amp;t=8s&quot;,&quot;video_type&quot;:&quot;youtube&quot;,&quot;controls&quot;:&quot;yes&quot;}\" data-widget_type=\"video.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-wrapper elementor-open-inline\">\n\t\t\t<div class=\"elementor-video\"><\/div>\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-391bee3 e-con-full e-flex e-con e-child\" data-id=\"391bee3\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-865fdcb elementor-widget elementor-widget-heading\" data-id=\"865fdcb\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Scraping Amazon Offers Data Using Scrapingdog<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-00288a2 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"00288a2\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This data will help you identify details about the seller, delivery options, pricing, etc.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cd07254 elementor-widget elementor-widget-code-highlight\" data-id=\"cd07254\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript 
line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>import requests\r\n\r\nurl = \"https:\/\/api.scrapingdog.com\/amazon\/offers\"\r\n\r\nparams = {\r\n    \"api_key\": \"your-api-key\",\r\n    \"asin\": \"B0BVJT3HVN\",\r\n    \"domain\": \"com\",\r\n    \"country\": \"us\"\r\n}\r\n\r\ntry:\r\n    response = requests.get(url, params=params)\r\n    response.raise_for_status()  # Raise error for bad responses\r\n    data = response.json()\r\n    print(\"\u2705 API Response:\")\r\n    print(data)\r\nexcept requests.exceptions.RequestException as e:\r\n    print(f\"\u274c Request failed: {e}\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e5a8b84 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"e5a8b84\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"57ab\" class=\"pw-post-body-paragraph mu mv gm mw b mx og mz na nb oh nd ne nf oi nh ni nj oj nl nm nn ok np nq nr gf bl\" data-selectable-paragraph=\"\">After running this code you will get this JSON response.<\/p><p data-selectable-paragraph=\"\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-31557 size-full\" src=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2025\/11\/1_4Cdajqca_MZpTZSvUZG6eA-1.png\" alt=\"\" width=\"1100\" height=\"822\" srcset=\"https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2025\/11\/1_4Cdajqca_MZpTZSvUZG6eA-1.png 1100w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2025\/11\/1_4Cdajqca_MZpTZSvUZG6eA-1-300x224.png 300w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2025\/11\/1_4Cdajqca_MZpTZSvUZG6eA-1-1024x765.png 1024w, https:\/\/www.scrapingdog.com\/wp-content\/uploads\/2025\/11\/1_4Cdajqca_MZpTZSvUZG6eA-1-768x574.png 768w\" sizes=\"(max-width: 1100px) 100vw, 1100px\" \/><\/p><p 
id=\"5eaa\" class=\"pw-post-body-paragraph mu mv gm mw b mx og mz na nb oh nd ne nf oi nh ni nj oj nl nm nn ok np nq nr gf bl\" data-selectable-paragraph=\"\">In addition to this, if you\u2019re building a keyword research tool, validating product ideas, or running sentiment analysis, you can use\u00a0<a href=\"https:\/\/docs.scrapingdog.com\/amazon-scraper-api\/amazon-autocomplete-scraper\" target=\"_blank\" rel=\"noopener\"><strong class=\"mw gn\">Scrapingdog\u2019s Amazon Autocomplete API<\/strong><\/a>.<\/p><p data-selectable-paragraph=\"\">You just have to make a GET request to the endpoint\u00a0<code class=\"de ov ow ox ny b\"><\/code><a class=\"ah oy\" href=\"https:\/\/api.scrapingdog.com\/amazon\/autocomplete\" target=\"_blank\" rel=\"noopener\">https:\/\/api.scrapingdog.com\/amazon\/autocomplete<\/a>\u00a0and pass your target keyword. For example, if you are looking for a pen holder, pass \u201c<strong class=\"mw gn\">pen holder<\/strong>\u201d as the\u00a0<strong class=\"mw gn\"><em class=\"oz\">prefix<\/em><\/strong>\u00a0parameter.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9f7790c elementor-widget elementor-widget-code-highlight\" data-id=\"9f7790c\" data-element_type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-javascript line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-javascript\">\n\t\t\t\t\t<xmp>import requests\r\n\r\n# API URL and key\r\napi_url = \"https:\/\/api.scrapingdog.com\/amazon\/autocomplete\"\r\napi_key = \"your-api-key\"\r\n\r\n# Search parameters\r\ndomain = \"com\"\r\nprefix = \"pen holder\"\r\n\r\n# Create a dictionary with the query parameters\r\nparams = {\r\n    \"api_key\": api_key,\r\n    \"domain\": domain,\r\n    \"prefix\": 
prefix\r\n}\r\n\r\n# Send the GET request with the specified parameters\r\nresponse = requests.get(api_url, params=params)\r\n\r\n# Check if the request was successful (status code 200)\r\nif response.status_code == 200:\r\n    data = response.json()\r\n    print(data)\r\nelse:\r\n    print(f\"HTTP Request Error: {response.status_code}\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6fdb055 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"6fdb055\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"fm fn fo fp fq m\"><article><div class=\"m\"><div class=\"m\"><section><div><div class=\"gf gg gh gi gj\"><div class=\"ac ci\"><div class=\"cp bi fr fs ft fu\"><p id=\"8944\" class=\"pw-post-body-paragraph mu mv gm mw b mx og mz na nb oh nd ne nf oi nh ni nj oj nl nm nn ok np nq nr gf bl\" data-selectable-paragraph=\"\">This will generate a list of keywords associated with the prefix.<\/p><\/div><\/div><\/div><\/div><\/section><\/div><\/div><\/article><\/div>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1dfe085 elementor-widget elementor-widget-heading\" data-id=\"1dfe085\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Conclusion<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5fc22bc font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"5fc22bc\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Over 80% of the e-commerce businesses today rely on web scraping. 
If you&#8217;re not using it, you&#8217;re already falling behind.\u00a0<\/p><p>There are many marketplaces that you can scrape &amp; extract data from. Having a strategy to <a href=\"https:\/\/www.scrapingdog.com\/blog\/scraping-e-commerce-data\/\" target=\"_blank\" rel=\"noopener\">scrape e-commerce data<\/a> for your product can take you far ahead of your competitors.\u00a0<\/p><p id=\"48e7\">In this tutorial, we scraped various data elements from Amazon. First, we used the requests library to download the raw HTML, and then we parsed the data we wanted with BS4. You can also use lxml in place of BS4 to extract data. Python and its libraries make scraping very simple, even for a beginner. Once you scale, you can switch to web scraping APIs to scrape millions of such pages.<\/p><p id=\"2c8f\">The combination of\u00a0<code>requests<\/code>\u00a0and Scrapingdog\u00a0can help you scale your scraper. You will get more than a 99% success rate while scraping Amazon with Scrapingdog.<\/p><p>If you want to track the price of a product on Amazon, we have a comprehensive tutorial on\u00a0<a href=\"https:\/\/www.scrapingdog.com\/blog\/build-amazon-price-tracker\/\" target=\"_blank\" rel=\"noreferrer noopener\" data-type=\"link\" data-id=\"https:\/\/www.scrapingdog.com\/blog\/build-amazon-price-tracker\/\"><span class=\"font-600\">tracking Amazon product prices using Python<\/span><\/a>.\u00a0<\/p><p id=\"bbc4\">I hope you like this little tutorial. 
If you do, please don&#8217;t forget to share it with your friends and on your social media.<\/p><p>You can combine this data with <a href=\"https:\/\/www.saasadviser.co\/software\/business-plan-software\" target=\"_blank\" rel=\"noopener\">business plan software<\/a> to offer different solutions to your clients.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a6f97db font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"a6f97db\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>If you are a non-developer and want to scrape data from Amazon, here is some good news for you.<br \/>We have recently launched Amazon Scraper, a Google Sheets add-on.\u00a0<br \/><br \/>Here is a video \ud83c\udfa5 tutorial that shows how to use it.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-fbbadcf elementor-widget elementor-widget-video\" data-id=\"fbbadcf\" data-element_type=\"widget\" data-settings=\"{&quot;youtube_url&quot;:&quot;https:\\\/\\\/www.youtube.com\\\/watch?v=_x-gLbDjsJw&quot;,&quot;video_type&quot;:&quot;youtube&quot;,&quot;controls&quot;:&quot;yes&quot;}\" data-widget_type=\"video.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-wrapper elementor-open-inline\">\n\t\t\t<div class=\"elementor-video\"><\/div>\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-b00694c e-con-full e-flex e-con e-child\" data-id=\"b00694c\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-f4dbda6 elementor-widget elementor-widget-heading\" data-id=\"f4dbda6\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 
class=\"elementor-heading-title elementor-size-default\">Frequently Asked Questions\n<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6bafe04 faq font-color-green elementor-widget elementor-widget-accordion\" data-id=\"6bafe04\" data-element_type=\"widget\" data-widget_type=\"accordion.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-accordion\">\n\t\t\t\t\t\t\t<div class=\"elementor-accordion-item\">\n\t\t\t\t\t<div id=\"elementor-tab-title-1121\" class=\"elementor-tab-title\" data-tab=\"1\" role=\"button\" aria-controls=\"elementor-tab-content-1121\" aria-expanded=\"false\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-accordion-icon elementor-accordion-icon-right\" aria-hidden=\"true\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-accordion-icon-closed\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"15\" height=\"17\" viewBox=\"0 0 15 17\" fill=\"none\"><path d=\"M7.72559 15.75L7.72559 0.75\" stroke=\"#3EA380\" stroke-width=\"1.5\" stroke-linecap=\"round\" stroke-linejoin=\"round\"><\/path><path d=\"M1.70124 9.70019L7.72524 15.7502L13.7502 9.7002\" stroke=\"#3EA380\" stroke-width=\"1.5\" stroke-linecap=\"round\" stroke-linejoin=\"round\"><\/path><\/svg><\/span>\n\t\t\t\t\t\t\t\t<span class=\"elementor-accordion-icon-opened\"><\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t<a class=\"elementor-accordion-title\" tabindex=\"0\">Is scraping Amazon allowed?<\/a>\n\t\t\t\t\t<\/div>\n\t\t\t\t\t<div id=\"elementor-tab-content-1121\" class=\"elementor-tab-content elementor-clearfix\" data-tab=\"1\" role=\"region\" aria-labelledby=\"elementor-tab-title-1121\"><div class=\"ea-card ea-expand sp-ea-single\"><div id=\"collapse32990\" class=\"sp-collapse spcollapse collapsed show\" role=\"region\" data-parent=\"#sp-ea-3299\" aria-labelledby=\"ea-header-32990\"><div class=\"ea-body\"><p>Yes, scraping Amazon is allowed as long as you 
are scraping public information. Extracting private data can cause problems and may lead to legal action against you.<\/p><\/div><\/div><\/div><div class=\"ea-card sp-ea-single\"><h3 class=\"ea-header\">\u00a0<\/h3><\/div><\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t<div class=\"elementor-accordion-item\">\n\t\t\t\t\t<div id=\"elementor-tab-title-1122\" class=\"elementor-tab-title\" data-tab=\"2\" role=\"button\" aria-controls=\"elementor-tab-content-1122\" aria-expanded=\"false\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-accordion-icon elementor-accordion-icon-right\" aria-hidden=\"true\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-accordion-icon-closed\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"15\" height=\"17\" viewBox=\"0 0 15 17\" fill=\"none\"><path d=\"M7.72559 15.75L7.72559 0.75\" stroke=\"#3EA380\" stroke-width=\"1.5\" stroke-linecap=\"round\" stroke-linejoin=\"round\"><\/path><path d=\"M1.70124 9.70019L7.72524 15.7502L13.7502 9.7002\" stroke=\"#3EA380\" stroke-width=\"1.5\" stroke-linecap=\"round\" stroke-linejoin=\"round\"><\/path><\/svg><\/span>\n\t\t\t\t\t\t\t\t<span class=\"elementor-accordion-icon-opened\"><\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t<a class=\"elementor-accordion-title\" tabindex=\"0\">How does Amazon detect scrapers?<\/a>\n\t\t\t\t\t<\/div>\n\t\t\t\t\t<div id=\"elementor-tab-content-1122\" class=\"elementor-tab-content elementor-clearfix\" data-tab=\"2\" role=\"region\" aria-labelledby=\"elementor-tab-title-1122\"><div><div id=\"sp_easy_accordion-1725097724\"><div id=\"sp-ea-3299\" class=\"sp-ea-one sp-easy-accordion\" data-ex-icon=\"minus\" data-col-icon=\"plus\" data-ea-active=\"ea-click\" data-ea-mode=\"vertical\" data-preloader=\"\" data-scroll-active-item=\"\" data-offset-to-scroll=\"0\"><div class=\"ea-card sp-ea-single ea-expand\"><div id=\"collapse32991\" class=\"sp-collapse spcollapse show\" role=\"region\" data-parent=\"#sp-ea-3299\" 
aria-labelledby=\"ea-header-32991\"><div class=\"ea-body\"><p>Amazon detects scrapers with anti-bot mechanisms that track, among other signals, your IP address, and it can block that IP if you keep scraping from it. Using a proxy management system helps you avoid these blocks.<\/p><\/div><\/div><\/div><\/div><\/div><\/div><\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-6a205f8 e-con-full e-flex e-con e-child\" data-id=\"6a205f8\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-31463b4 elementor-widget elementor-widget-heading\" data-id=\"31463b4\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Additional Resources<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-41bf9f6 font-color-green elementor-widget elementor-widget-text-editor\" data-id=\"41bf9f6\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"1d00\">Here are a few additional resources that you may find helpful during your web scraping journey:<\/p><ul class=\"wp-block-list\"><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/scrape-amazon-reviews\/\" target=\"_blank\" rel=\"noreferrer noopener\" data-type=\"link\" data-id=\"https:\/\/www.scrapingdog.com\/blog\/scrape-amazon-reviews\/\"><span class=\"font-600\">How to Scrape Amazon Review using Python<\/span><\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/webscraping-problems\/amazon-captcha-bypass-and-avoid-ip-ban\/\" target=\"_blank\" rel=\"noopener\" data-type=\"link\" data-id=\"https:\/\/www.scrapingdog.com\/webscraping-problems\/web-scraping-blocked\/avoid-ip-ban-when-scraping-amazon\"><span 
class=\"font-600\">How To Avoid IP Bans &amp; CAPTCHA while Scraping Amazon<\/span><\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/amazon-price-tracker-using-scrapingdog-amazon-scraper-api-and-make\/\" target=\"_blank\" rel=\"noopener\">Automate Amazon Price Tracking using Scrapingdog&#8217;s Amazon Scraper API &amp; Make.com<\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/scrape-walmart\/\" target=\"_blank\" rel=\"noreferrer noopener\"><span class=\"font-600\">How to Scrape Walmart using Python<\/span><\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/scrape-flipkart\/\" target=\"_blank\" rel=\"noopener\">How to Scrape Flipkart using Python<\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/web-scraping-myntra\/\" target=\"_blank\" rel=\"noreferrer noopener\" data-type=\"link\" data-id=\"https:\/\/www.scrapingdog.com\/blog\/web-scraping-myntra\/\"><span class=\"font-600\">Web Scraping Myntra using Python<\/span><\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/scrape-ebay\/\" target=\"_blank\" rel=\"noreferrer noopener\" data-type=\"URL\" data-id=\"https:\/\/www.scrapingdog.com\/blog\/scrape-ebay\/\"><span class=\"font-600\">Web Scraping eBay using Python<\/span><\/a><\/li><li><a href=\"https:\/\/www.scrapingdog.com\/blog\/scrape-google-shopping\/\">Web Scraping Google Shopping using Python<\/a><\/li><\/ul>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-804f2b3 e-con-full web-scraping-right-con elementor-hidden-desktop elementor-hidden-tablet e-flex e-con e-child\" data-id=\"804f2b3\" data-element_type=\"container\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;,&quot;sticky&quot;:&quot;top&quot;,&quot;sticky_on&quot;:[&quot;desktop&quot;,&quot;tablet&quot;],&quot;sticky_parent&quot;:&quot;yes&quot;,&quot;sticky_offset&quot;:0,&quot;sticky_effects_offset&quot;:0}\">\n\t\t<div class=\"elementor-element elementor-element-706aad6 
e-con-full e-flex e-con e-child\" data-id=\"706aad6\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-ecc8f03 elementor-widget elementor-widget-heading\" data-id=\"ecc8f03\" data-element_type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Web Scraping with Scrapingdog<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ebd456e elementor-widget elementor-widget-text-editor\" data-id=\"ebd456e\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tScrape the web without the hassle of getting blocked\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-713efce e-con-full e-flex e-con e-child\" data-id=\"713efce\" data-element_type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-be4cbe4 elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"be4cbe4\" data-element_type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/api.scrapingdog.com\/register\" target=\"_blank\" rel=\"noopener\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span class=\"elementor-button-text\">Try for Free<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-95927eb elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"95927eb\" data-element_type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/share.hsforms.com\/1ex4xYy1pTt6rrqFlRAquwQ4h1b2\" target=\"_blank\" rel=\"noopener\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span class=\"elementor-button-text\">Contact sales<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>We have used Python to extract data from Amazon Listing. Further, we have given a solution to avoid getting blocked by Amazon when scraping in heavy volume. <\/p>\n","protected":false},"author":5,"featured_media":18054,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[25,84],"tags":[],"class_list":["post-5524","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog","category-python"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/posts\/5524","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/comments?post=5524"}],"version-history":[{"count":9,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/posts\/5524\/revisions"}],"predecessor-version":[{"id":31591,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/posts\/5524\/revisions\/31591"}],"wp:featuredmedia":[{"embeddable":true,"href":
"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/media\/18054"}],"wp:attachment":[{"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/media?parent=5524"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/categories?post=5524"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.scrapingdog.com\/wp-json\/wp\/v2\/tags?post=5524"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}