Selenium parser

Asked 2025-04-12 06:38

I wrote a Selenium parser in Python. It runs fine on my local server, but on an AWS EC2 server it fails with the error below. How can I fix it?

The error message is: ERR HTTPConnectionPool(host='localhost', port=50687): Max retries exceeded with url: /session/56a774d4a576540ea13a5de796af6b8a/execute/sync (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x796147a135e0>: Failed to establish a new connection: [Errno 111] Connection refused'))
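For context, "Connection refused" on localhost means nothing is listening on that port any more, so the Selenium client most likely lost its chromedriver process after the session had been created. A minimal sketch for confirming this from inside the container follows; `port_is_open` is a hypothetical helper name, not part of Selenium, and it uses only the standard library.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when no process is listening on the target port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open("localhost", 50687)` right after the failure would be expected to return False if the driver process has exited. The port number is taken from the error message and changes on every driver start.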

Here is my Dockerfile:

FROM python:3.10

WORKDIR /service
COPY requirements.txt ./

RUN apt-get update && \
    apt-get install -y \
    wget \
    ca-certificates \
    fonts-noto \
    libxss1 \
    libappindicator3-1 \
    fonts-liberation \
    xdg-utils \
    gnupg

RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - && \
    echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list && \
    apt-get update && \
    apt-get install -y google-chrome-stable


ENV CHROME_BIN=/usr/bin/google-chrome-stable

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 80

CMD ["python", "manage.py", "runserver", "0.0.0.0:80"]

Here is my docker-compose.yml file:

services:
  worker-1:
    build: .
    volumes:
      - ./service:/service
    command: python manage.py runserver 0.0.0.0:80
    ports:
      - '80:80'
    restart: on-failure

  worker-2:
    build: .
    volumes:
      - ./service:/service
    command: python manage.py via_parser
    restart: on-failure

This is the file I run, via_parser.py:

# Imports are assumed from the names used below; they are not shown in the post.
from fake_useragent import UserAgent
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium_stealth import stealth

while True:
    print("circle start")
    try:
        chrome_options = Options()
        # Pick a random user agent for each new browser instance.
        ua = UserAgent()
        userAgent = ua.random
        chrome_options.add_argument("--headless")
        print(userAgent)
        chrome_options.add_argument(f"user-agent={userAgent}")
        chrome_options.add_argument("--disable-extensions")
        chrome_options.add_argument("--disable-application-cache")
        chrome_options.add_argument("--disable-gpu")
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--disable-setuid-sandbox")
        chrome_options.add_argument("--disable-dev-shm-usage")
        chrome_options.add_argument("--disable-blink-features=AutomationControlled")
        # chrome_options.binary_location = '/usr/bin/google-chrome-stable'
        # driver = uc.Chrome(options=chrome_options, version_main=122)
        chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
        chrome_options.add_experimental_option('useAutomationExtension', False)
        driver = webdriver.Chrome(options=chrome_options)

        stealth(driver,
                languages=["en-US", "en"],
                vendor="Google Inc.",
                # platform="Win64",
                platform="Linux x86_64",
                webgl_vendor="Intel Inc.",
                renderer="Intel Iris OpenGL Engine",
                fix_hairline=True,
                )
        # ... the rest of the loop body and the except clause are not shown

I tried changing and adding add_argument options, but nothing helped.
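Because the loop in via_parser.py starts a new Chrome on every pass, and the excerpt does not show whether `driver.quit()` is ever called, one thing worth ruling out is leftover browser processes exhausting memory on a small EC2 instance. The sketch below is one possible way to guarantee cleanup, not a confirmed fix; `managed_driver` is a hypothetical helper name, and `factory` is any callable that returns an object with a `quit()` method.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver via factory() and always call quit() on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs even if the body raises, so the browser and driver
        # processes are shut down on every loop iteration.
        driver.quit()
```

Inside the loop this could be used as `with managed_driver(lambda: webdriver.Chrome(options=chrome_options)) as driver:`, with the `stealth(...)` call and the parsing code placed in the `with` body.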
