Getting Beautiful Soup web-scrape results into MySQL

Posted 2024-10-01 22:31:57


The current code downloads the chart and prints it to the screen, but how do I get that printed material into a SQL database instead? If I wanted the data in a CSV file, Python (on a good day) seems to create the file automatically. Obviously, in moving to MySQL I assume I have to create a database beforehand to receive the data. My question is how to skip the CSV step entirely and go straight into the database. As anticipated, I have already downloaded the PyMySQL library. Any suggestions are much appreciated. Here is the current code:

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://www.officialcharts.com/charts/singles-chart/19800203/7501/")

bsObj = BeautifulSoup(html, "html.parser")
nameList = bsObj.findAll("div", {"class": "artist"})
for name in nameList:
    print(name.get_text())

html = urlopen("http://www.officialcharts.com/charts/singles-chart/19800203/7501/")
bsObj = BeautifulSoup(html, "html.parser")
nameList = bsObj.findAll("div", {"class": "title"})
for name in nameList:
    print(name.get_text())

1 answer

Answered 2024-10-01 22:31:57

There are a few things to address here.

The docs on PyMySQL are very good at getting you up and running.

Before putting this into a database, though, you need the artist and song names in a form that keeps them associated with each other. Right now you get a separate list of artists and a separate list of songs, with no way to relate one to the other. You'll want to iterate over the title-artist class to do this.

Here's how I would do it:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import pymysql.cursors

# Webpage connection
html = urlopen("http://www.officialcharts.com/charts/singles-chart/19800203/7501/")

# Grab title-artist classes and iterate
bsObj = BeautifulSoup(html, "html.parser")
recordList = bsObj.findAll("div", {"class": "title-artist"})

# Now iterate over recordList to grab title and artist
for record in recordList:
    title = record.find("div", {"class": "title"}).get_text().strip()
    artist = record.find("div", {"class": "artist"}).get_text().strip()
    print(artist + ': ' + title)

This will print the title and artist for each iteration of the recordList loop.
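As a side note, the .strip() calls matter: chart pages often wrap these text nodes in newlines and padding, which would otherwise end up in your database. A quick stdlib-only illustration (the raw strings here are hypothetical, standing in for what get_text() might return):

```python
# Hypothetical raw text as it might come back from get_text()
raw_artist = "\n        FLEETWOOD MAC\n    "
raw_title = "\n        Sara\n    "

# .strip() removes the surrounding whitespace before printing or inserting
artist = raw_artist.strip()
title = raw_title.strip()
line = artist + ': ' + title
print(line)  # FLEETWOOD MAC: Sara
```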

To insert these values into a MySQL database, I created a table called artist_song. (The original statement was lost in formatting; a minimal schema consistent with the INSERT below would be:)

    CREATE TABLE artist_song (
        id INT AUTO_INCREMENT PRIMARY KEY,
        artist VARCHAR(255) NOT NULL,
        song VARCHAR(255) NOT NULL
    );

This isn't the cleanest way to do it, but the idea is sound. We want to open a connection to the MySQL database (I've called my DB top_40) and insert an artist/title pair for each iteration of the recordList loop:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import pymysql.cursors


# Webpage connection
html = urlopen("http://www.officialcharts.com/charts/singles-chart/19800203/7501/")

# Grab title-artist classes and store in recordList
bsObj = BeautifulSoup(html, "html.parser")
recordList = bsObj.findAll("div", {"class": "title-artist"})

# Create a pymysql cursor and iterate over each title-artist record.
# This will create an INSERT statement for each artist/title pair, then commit
# the transaction after reaching the end of the list. pymysql does not
# have autocommit enabled by default. After committing it will close
# the database connection.
# Create database connection

connection = pymysql.connect(host='localhost',
                             user='root',
                             password='password',
                             db='top_40',
                             charset='utf8mb4',
                             cursorclass=pymysql.cursors.DictCursor)

try:
    with connection.cursor() as cursor:
        for record in recordList:
            title = record.find("div", {"class": "title",}).get_text().strip()
            artist = record.find("div", {"class": "artist"}).get_text().strip()
            sql = "INSERT INTO `artist_song` (`artist`, `song`) VALUES (%s, %s)"
            cursor.execute(sql, (artist, title))
    connection.commit()
finally:
    connection.close()
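If you want to try the insert logic without a MySQL server handy, the same pattern (parameterized INSERT in a loop, explicit commit, close in a finally block) can be exercised against Python's built-in sqlite3 module; only the connect call and the placeholder style (? instead of %s) differ. The records below are made up for illustration:

```python
import sqlite3

# In-memory stand-in for the top_40 MySQL database
connection = sqlite3.connect(":memory:")

# Hypothetical scraped pairs, in place of the recordList loop
records = [("BLONDIE", "Atomic"), ("FLEETWOOD MAC", "Sara")]

try:
    cursor = connection.cursor()
    cursor.execute(
        "CREATE TABLE artist_song (artist TEXT NOT NULL, song TEXT NOT NULL)"
    )
    for artist, title in records:
        # sqlite3 uses ? placeholders where pymysql uses %s
        sql = "INSERT INTO artist_song (artist, song) VALUES (?, ?)"
        cursor.execute(sql, (artist, title))
    connection.commit()  # like pymysql, sqlite3 does not autocommit here
    count = cursor.execute("SELECT COUNT(*) FROM artist_song").fetchone()[0]
    print(count)  # 2
finally:
    connection.close()
```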

Edit: Based on my comment, I think it would be cleaner to iterate over the table rows:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import pymysql.cursors


# Webpage connection
html = urlopen("http://www.officialcharts.com/charts/singles-chart/19800203/7501/")

bsObj = BeautifulSoup(html, "html.parser")

rows = bsObj.findAll('tr')
for row in rows:
    if row.find('span', {'class': 'position'}):
        position = row.find('span', {'class': 'position'}).get_text().strip()
        artist = row.find('div', {'class': 'artist'}).get_text().strip()
        track = row.find('div', {'class': 'title'}).get_text().strip()
        print(position, artist, track)
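Once you have a (position, artist, track) tuple per row, you can also batch the inserts with executemany instead of one execute call per row. Sketched here with sqlite3 and made-up rows so it runs standalone, but pymysql cursors expose the same executemany method:

```python
import sqlite3

# Hypothetical (position, artist, track) tuples collected from the row loop
rows = [
    ("1", "BLONDIE", "Atomic"),
    ("2", "FLEETWOOD MAC", "Sara"),
    ("3", "THE SPECIALS", "Too Much Too Young"),
]

connection = sqlite3.connect(":memory:")
try:
    cursor = connection.cursor()
    cursor.execute(
        "CREATE TABLE chart (position TEXT, artist TEXT, track TEXT)"
    )
    # One call for the whole batch instead of one execute per row
    cursor.executemany(
        "INSERT INTO chart (position, artist, track) VALUES (?, ?, ?)", rows
    )
    connection.commit()
    top = cursor.execute(
        "SELECT artist FROM chart WHERE position = '1'"
    ).fetchone()[0]
    print(top)  # BLONDIE
finally:
    connection.close()
```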
