python - Scrapy: avoid crawler logging out


I am using the Scrapy library to facilitate crawling a website.

The website uses authentication, and I can log in to the page using Scrapy.

The page has a URL that logs out the user and destroys the session.

How can I ensure Scrapy avoids the logout page when crawling?

If you are using link extractors and don't want to follow a particular "logout" link, you can set the deny property:

rules = [Rule(SgmlLinkExtractor(deny=[r'logout/']), follow=True)]
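To see which URLs a deny pattern like this would exclude, here is a minimal standalone sketch. It assumes the link extractor treats each deny entry as a regular expression and searches for it anywhere in the absolute URL, which is my understanding of how Scrapy behaves. The helper `is_denied` and the example.com URLs are made up for illustration and are not part of Scrapy.

```python
import re

# Same pattern as in the rule above, compiled as a regular expression.
DENY_PATTERNS = [re.compile(r'logout/')]


def is_denied(url, patterns=DENY_PATTERNS):
    """Return True if any deny pattern is found anywhere in the URL.

    Hypothetical helper that mimics the assumed matching behaviour;
    it is not a Scrapy function.
    """
    return any(p.search(url) for p in patterns)


# Hypothetical URLs, for illustration only.
candidates = [
    "http://example.com/account/logout/",
    "http://example.com/logout",
    "http://example.com/products/",
]

# Keep only the links that no deny pattern matches.
allowed = [u for u in candidates if not is_denied(u)]
print(allowed)
```

Note that the pattern includes the trailing slash, so under this assumption a link ending in plain "/logout" would still be followed. If your site uses that form, r'logout' without the slash is the safer pattern.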

Another option is to check response.url inside the spider's parse method:

def parse(self, response):
    if 'logout' in response.url:
        return

    # extract items
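One thing to keep in mind with the check in parse: the callback only runs after the logout URL has already been requested, so it skips item extraction but the session may already be gone. If you build requests yourself, a rough sketch of filtering links before they are requested could look like the following. `should_follow` and the sample URLs are hypothetical names for illustration and use only the standard library.

```python
from urllib.parse import urlparse


def should_follow(url):
    """Return False for URLs whose path contains a 'logout' segment.

    Hypothetical helper: compares whole path segments, so a page such as
    /blog/logout-tips is still followed.
    """
    segments = urlparse(url).path.lower().split("/")
    return "logout" not in segments


# Hypothetical links, as they might be extracted from a page.
links = [
    "http://example.com/account/logout/",
    "http://example.com/logout",
    "http://example.com/blog/logout-tips",
]

# Only these URLs would be turned into new requests.
to_request = [u for u in links if should_follow(u)]
print(to_request)
```

Inside a spider you would call the helper on each extracted link and only yield a request when it returns True.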

Hope that helps.

