MCA [RAW]

CONCEPTS

01 HTTP requests (GET, POST)

02 Parsing HTML with BeautifulSoup

03 Finding elements by tag/class/id

04 Extracting attributes and text

05 Handling pagination

06 Respecting robots.txt
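
As a rough illustration of the robots.txt concept, the sketch below uses Python's standard urllib.robotparser module to decide whether a URL may be fetched. The robots.txt content, the "my-scraper" user agent, and the helper names build_parser and is_allowed are assumptions made up for this example. The rules are inlined so it runs without network access; a real scraper would more likely call set_url() and read() on the parser to download the site's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Build a parser from robots.txt text (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
for candidate in (
    "https://example.com/index.html",
    "https://example.com/private/data.html",
):
    print(candidate, "allowed:", is_allowed(parser, "my-scraper", candidate))
```

Running a check like this before each request is one simple way to keep a scraper inside the paths a site permits.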

SYNTAX_DEMO

Extracting data from the web

import requests
from bs4 import BeautifulSoup

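url = "https://example.com"
# A timeout keeps the script from hanging if the server never responds
response = requests.get(url, timeout=10)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "html.parser")

    # Get the title (soup.title is None when the page has no <title> tag)
    title = soup.title.string if soup.title else None
    print("Page Title:", title)

    # Find all links that carry an href attribute
    for link in soup.find_all("a", href=True):
        print(link["href"])
else:
    print("Request failed with status code:", response.status_code)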
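
The demo above covers only tag lookups on a single page. The following sketch shows one possible way to combine lookups by id and class, text extraction, and pagination by following a "next" link. The two-page site, the "results", "item", and "next" names, and the fetch and scrape_items helpers are all hypothetical. The pages are inlined so the example runs offline, and in practice fetch would probably wrap requests.get.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical two-page site, inlined so the sketch runs without network access.
PAGES = {
    "https://example.com/list?page=1": """
        <div id="results">
          <a class="item" href="/item/1">First</a>
          <a class="item" href="/item/2">Second</a>
        </div>
        <a class="next" href="/list?page=2">Next</a>
    """,
    "https://example.com/list?page=2": """
        <div id="results">
          <a class="item" href="/item/3">Third</a>
        </div>
    """,
}


def fetch(url):
    """Stand-in for requests.get(url, timeout=10).text (assumption)."""
    return PAGES[url]


def scrape_items(start_url, max_pages=10):
    """Collect (text, absolute URL) pairs, following "next" links."""
    items = []
    url = start_url
    pages_seen = 0
    # max_pages guards against a site whose "next" links form a loop
    while url is not None and pages_seen < max_pages:
        soup = BeautifulSoup(fetch(url), "html.parser")

        # Look up the container by id, then its links by tag and class
        results = soup.find(id="results")
        if results is not None:
            for a in results.find_all("a", class_="item", href=True):
                items.append((a.get_text(strip=True), urljoin(url, a["href"])))

        # Move to the next page, or stop when no "next" link exists
        next_link = soup.find("a", class_="next", href=True)
        url = urljoin(url, next_link["href"]) if next_link is not None else None
        pages_seen += 1
    return items


items = scrape_items("https://example.com/list?page=1")
for text, href in items:
    print(text, href)
```

urljoin resolves relative href values against the current page URL, so the collected links remain usable outside the page they came from.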