[Scraping] Setting Up Web Scraping in Colab
너드나무
2024. 10. 28. 19:23
Accessing Google Colab
Google Colab: colab.research.google.com
pip Installation
!python --version
!pip install selenium
!apt-get update
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
Python 3.10.12
Collecting selenium
Downloading selenium-4.25.0-py3-none-any.whl.metadata (7.1 kB)
~~~~
Processing triggers for man-db (2.10.2-1) ...
Processing triggers for dbus (1.12.20-2ubuntu4.1) ...
cp: '/usr/lib/chromium-browser/chromedriver' and '/usr/bin/chromedriver' are the same file
- Check the Python version inside the instance, then set up the environment with the pip and apt commands above; a quick sanity check of the installed driver is sketched below.
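To confirm that the apt step actually put the driver on the PATH, a couple of standard commands can be run in the same cell (a minimal sanity check; --version is a standard chromedriver flag):

!which chromedriver
!chromedriver --version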
Selenium Test
- Specifying a separate Chrome WebDriver Service raised an argument-related exception (see the sketch below).
- Normal operation was confirmed when the default driver path was used.
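For reference, "specifying the Service separately" means constructing the driver roughly as follows; the /usr/bin/chromedriver path is assumed from the cp step above, and this is the variant reported above as raising the exception on Colab:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.chrome.options import Options as ChromeOptions

options = ChromeOptions()
options.add_argument('headless')
options.add_argument('--no-sandbox')

# Explicit Service object pointing at the copied driver binary
service = ChromeService(executable_path='/usr/bin/chromedriver')
driver = webdriver.Chrome(service=service, options=options)

The working block below sticks to webdriver.Chrome(options=options), leaving driver discovery to Selenium, which ran without issue here.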
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.chrome.options import Options as ChromeOptions
from selenium.webdriver.common.by import By
options = ChromeOptions()
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
options.add_argument('user-agent=' + user_agent)
options.add_argument("lang=ko_KR")
options.add_argument('headless')
options.add_argument('window-size=1920x1080')
options.add_argument("disable-gpu")
options.add_argument("--no-sandbox")
# Chrome driver created with the default driver path (no explicit Service)
driver = webdriver.Chrome(options=options)
# Set the page URL and load it
base_url = "https://google.com/"
driver.get(base_url)
driver.quit()
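From here, the By import above comes into play for actually pulling data off the page. A minimal follow-up sketch (the name="q" locator for Google's search box is an assumption for illustration, not something verified in this post):

from selenium import webdriver
from selenium.webdriver.chrome.options import Options as ChromeOptions
from selenium.webdriver.common.by import By

options = ChromeOptions()
options.add_argument('headless')
options.add_argument('window-size=1920x1080')
options.add_argument('--no-sandbox')

driver = webdriver.Chrome(options=options)
driver.get("https://google.com/")

print(driver.title)  # title of the loaded page

# Locate the search box by its name attribute (assumed to be "q")
search_box = driver.find_element(By.NAME, "q")
print(search_box.tag_name)

driver.quit()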