[Python] How to Use CSS Selectors with BeautifulSoup select()

BeautifulSoup(html, 'html.parser').select()

The select() method returns, as a list, every tag in the HTML document that matches the given CSS selector.
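A minimal sketch of that return behavior, using a tiny stand-in document (the `doc` string and variable names are made up for illustration). It also shows `select_one()`, which returns only the first match, or `None` when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in document, just for this sketch
doc = "<ul><li>one</li><li>two</li></ul>"
soup = BeautifulSoup(doc, "html.parser")

items = soup.select("li")       # every match, in document order
missing = soup.select("table")  # no match gives an empty list, not None
first = soup.select_one("li")   # first match only, or None if nothing matches

print(isinstance(items, list), len(items))  # True 2
print(missing)                               # []
print(first.get_text())                      # one
```

Because the result is always a list, it can be looped over or length-checked without a None check first.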

Example HTML code

# Parse the HTML document and create the soup object
from bs4 import BeautifulSoup

html = """
<html>
<head><title>The Dormouse's story</title></head>
<body>
  <p class="title"><b>The Dormouse's story</b></p>
  <p class="story">Once upon a time there were three little sisters; and their names were
    <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
    <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
    <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
  </p>
</body>
</html>
"""

soup = BeautifulSoup(html, 'html.parser')

Usage

Find by tag name

soup.select("title")
# Result: [<title>The Dormouse's story</title>]

Find with the nth-of-type selector
- Selects the n-th element of its type among its siblings (n starts at 1)

soup.select("p:nth-of-type(1)")
# Result: [<p class="title"><b>The Dormouse's story</b></p>]
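As a rough sketch of how the index behaves, the same selector with other values of n picks later siblings of the same tag. The HTML is the example document from above, repeated so the snippet runs on its own; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
  <p class="title"><b>The Dormouse's story</b></p>
  <p class="story">Once upon a time there were three little sisters; and their names were
    <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
    <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
    <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
  </p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# n is 1-based and counts only siblings that share the same tag name
second_p = soup.select("p:nth-of-type(2)")
third_a = soup.select("a:nth-of-type(3)")

print(second_p[0]["class"])  # ['story']
print(third_a[0]["id"])      # link3
```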

Find by ID

soup.select("#link1")
# Result: [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>]

Find by combining a tag and an ID

soup.select("a#link2")
# Result: [<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>]

Find by tag hierarchy (descendants at any depth)

soup.select("body a")
# Result: [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
#          <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
#          <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]

Find direct children using ">"

soup.select("head > title")
# Result: [<title>The Dormouse's story</title>]
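The difference between the space (descendant) and ">" (child) combinators is easiest to see side by side. A small sketch on the same example document, repeated here so it runs standalone; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
  <p class="title"><b>The Dormouse's story</b></p>
  <p class="story">Once upon a time there were three little sisters; and their names were
    <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
    <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
    <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
  </p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

descendants = soup.select("body a")  # <a> at any depth below <body>
children = soup.select("body > a")   # <a> that is a direct child of <body>
direct = soup.select("p > a")        # the links sit directly inside <p class="story">

print(len(descendants))  # 3
print(children)          # []
print(len(direct))       # 3
```

"body > a" comes back empty because every link is nested one level deeper, inside a p tag.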

Find by class name

soup.select(".sister")
# Result: [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
#          <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
#          <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]
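The list holds Tag objects, so the usual next step is pulling out text or attribute values. A minimal sketch, assuming the same example document (repeated so the snippet is self-contained); `names` and `links` are illustrative variable names.

```python
from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
  <p class="title"><b>The Dormouse's story</b></p>
  <p class="story">Once upon a time there were three little sisters; and their names were
    <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
    <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
    <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
  </p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

# "a.sister" combines a tag name with a class, like "a#link2" does with an ID
sisters = soup.select("a.sister")

names = [a.get_text() for a in sisters]  # text inside each tag
links = [a["href"] for a in sisters]     # value of the href attribute

print(names)  # ['Elsie', 'Lacie', 'Tillie']
print(links)  # ['http://example.com/elsie', 'http://example.com/lacie', 'http://example.com/tillie']
```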

Find with an attribute selector (tags that have the attribute)

soup.select('a[href]')
# Result: [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
#          <a class="sister" href="http://example.com/lacie" id="link2">Lacie</a>,
#          <a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>]

Find elements by attribute value

soup.select('a[href="http://example.com/elsie"]')
# Result: [<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>]
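Beyond an exact match, the standard CSS attribute operators for prefix (^=), suffix ($=), and substring (*=) matching also work with select(). A short sketch on the same example document, repeated so it runs on its own; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = """
<html><head><title>The Dormouse's story</title></head>
<body>
  <p class="title"><b>The Dormouse's story</b></p>
  <p class="story">Once upon a time there were three little sisters; and their names were
    <a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
    <a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
    <a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
  </p>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

starts = soup.select('a[href^="http://example.com/"]')  # href starts with the value
ends = soup.select('a[href$="tillie"]')                  # href ends with the value
contains = soup.select('a[href*="lacie"]')               # href contains the value

print([a["id"] for a in starts])    # ['link1', 'link2', 'link3']
print([a["id"] for a in ends])      # ['link3']
print([a["id"] for a in contains])  # ['link2']
```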

Reference: Beautiful Soup Documentation (Beautiful Soup 4.12.0 documentation), www.crummy.com
