Scrapy 1.4 Documentation
... css('div.quote'): yield { 'text': quote.css('span.text::text').extract_first(), 'author': quote.xpath('span/small/text()').extract_first(), } ... next_page ... if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: ... of the quotes in JSON format, containing text and author, looking like this (reformatted here for better readability): [{ "author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady ...
0 credits | 394 pages | 589.10 KB | 1 year ago
Scrapy 1.7 Documentation
... response): for quote in response.css('div.quote'): yield { 'text': quote.css('span.text::text').get(), 'author': quote.xpath('span/small/text()').get(), } ... a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: ... list of the quotes in JSON format, containing text and author, looking like this (reformatted here for better readability): [{ "author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady ...
0 credits | 306 pages | 1.23 MB | 1 year ago
Scrapy 2.4 Documentation
... quote in response.css('div.quote'): yield { 'author': quote.xpath('span/small/text()').get(), 'text': quote.css('span.text::text').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: scrapy runspider ... quotes.jl file a list of the quotes in JSON Lines format, containing text and author, looking like this: {"author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady, who has not pleasure ...
0 credits | 354 pages | 1.39 MB | 1 year ago
Scrapy 1.8 Documentation
... response): for quote in response.css('div.quote'): yield { 'text': quote.css('span.text::text').get(), 'author': quote.xpath('span/small/text()').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: ... list of the quotes in JSON format, containing text and author, looking like this (reformatted here for better readability): [{ "author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady ...
0 credits | 335 pages | 1.44 MB | 1 year ago
Scrapy 2.3 Documentation
... quote in response.css('div.quote'): yield { 'author': quote.xpath('span/small/text()').get(), 'text': quote.css('span.text::text').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: scrapy runspider ... list of the quotes in JSON format, containing text and author, looking like this (reformatted here for better readability): [{ "author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady ...
0 credits | 352 pages | 1.36 MB | 1 year ago
Scrapy 2.2 Documentation
... quote in response.css('div.quote'): yield { 'author': quote.xpath('span/small/text()').get(), 'text': quote.css('span.text::text').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: scrapy runspider ... list of the quotes in JSON format, containing text and author, looking like this (reformatted here for better readability): [{ "author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady ...
0 credits | 348 pages | 1.35 MB | 1 year ago
Scrapy 2.1 Documentation
... quote in response.css('div.quote'): yield { 'author': quote.xpath('span/small/text()').get(), 'text': quote.css('span.text::text').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: scrapy runspider ... list of the quotes in JSON format, containing text and author, looking like this (reformatted here for better readability): [{ "author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady ...
0 credits | 342 pages | 1.32 MB | 1 year ago
Scrapy 2.0 Documentation
... quote in response.css('div.quote'): yield { 'author': quote.xpath('span/small/text()').get(), 'text': quote.css('span.text::text').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: scrapy runspider ... list of the quotes in JSON format, containing text and author, looking like this (reformatted here for better readability): [{ "author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady ...
0 credits | 336 pages | 1.31 MB | 1 year ago
Scrapy 2.6 Documentation
... quote in response.css('div.quote'): yield { 'author': quote.xpath('span/small/text()').get(), 'text': quote.css('span.text::text').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: scrapy runspider ... quotes.jl file a list of the quotes in JSON Lines format, containing text and author, looking like this: {"author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady, who has not pleasure ...
0 credits | 384 pages | 1.63 MB | 1 year ago
Scrapy 2.5 Documentation
... quote in response.css('div.quote'): yield { 'author': quote.xpath('span/small/text()').get(), 'text': quote.css('span.text::text').get(), } next_page = response.css('li.next a::attr("href")').get() if next_page is not None: yield response.follow(next_page, self.parse) Put this in a text file, name it something like quotes_spider.py, and run the spider using the runspider command: scrapy runspider ... quotes.jl file a list of the quotes in JSON Lines format, containing text and author, looking like this: {"author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady, who has not pleasure ...
0 credits | 366 pages | 1.56 MB | 1 year ago
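
The older excerpts above describe the output as a JSON list, while the 2.4 to 2.6 excerpts describe JSON Lines. As a rough illustration of the difference between the two layouts, here is a minimal sketch using only Python's standard json module. It is not Scrapy's exporter code; the records and the helper names (to_json, to_json_lines, from_json_lines) are hypothetical placeholders chosen for this example.

```python
import json

# Hypothetical records standing in for scraped items (not real quotes).
items = [
    {"author": "Author A", "text": "First placeholder quote"},
    {"author": "Author B", "text": "Second placeholder quote"},
]


def to_json(records):
    """Serialize all records as one JSON array (a single document)."""
    return json.dumps(records)


def to_json_lines(records):
    """Serialize each record as its own JSON object, one per line."""
    return "\n".join(json.dumps(record) for record in records)


def from_json_lines(payload):
    """Parse JSON Lines text back into a list, skipping blank lines."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


json_doc = to_json(items)
jsonl_doc = to_json_lines(items)

print(json_doc)
print(jsonl_doc)
```

A JSON array has to be parsed as a whole, whereas each line of a JSON Lines file is an independent document, so records can be appended or read one at a time.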
62 results in total