[Firefox] site scraping extension

Solved
eddy32 Posted messages 42 Status Member
Hello (or good evening)

I am looking for an extension to scrape a website using Firefox.

Why? Until now I have been using HTTrack, but the site I want to scrape (only part of it) requires authentication, and once logged in, the site automatically redirects to one and only one page: the user's profile page.
I have tried many tricks with HTTrack, but none of them worked.

So I suppose that by starting directly from the browser, which lets me navigate to the part I'm interested in, I could carry out this targeted capture.
I suppose.

Thank you in advance for clarifying whether such a possibility exists or not.

See you later
e>>y
--
23 times 3 = 3 times 23 = sixty-nine
Configuration: Windows XP Firefox 2.0.0.4

4 replies

sebsauvage Posted messages 33284 Status Moderator
 
The ScrapBook extension can do that, although it is considerably slower than HTTrack.
eddy32 Posted messages 42 Status Member
 
Sure, thank you, I will take a look at this one.

On my side, I found SpiderZilla, which seems to be built on HTTrack: http://spiderzilla.mozdev.org/installation.html

I wasn't searching with the right words: I was typing "web scraper" and got nothing. Even once I found the word "spider," Mozilla's search didn't suggest ScrapBook. I will take a look at it when I finish my tests with SpiderZilla, and I will report back on how the latter copes with sites secured by a login and password.

See you later
--
23 times 3 = 3 times 23 = sixty-nine
eddy32 Posted messages 42 Status Member
 
SpiderZilla didn't work. It is very well made, in the spirit of HTTrack, but it cannot scrape a site protected by a login and password.

SCRAPBOOK
http://amb.vis.ne.jp/mozilla/scrapbook/
IS THE BEST


Nothing more to say: ScrapBook really is the best. It did the job in the simplest way possible, with no special settings (apart from locating the Firefox profile directory so the archive could be placed in a personal folder).
One caveat: the authentication session must stay active, so switch pages in the browser from time to time to refresh the session while ScrapBook is working.
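
For anyone who wants to script the same idea outside the browser, here is a minimal Python sketch of it. This is not what ScrapBook does internally. It assumes the site tracks the session with cookies and sends an expired session back to a login page. The `LOGIN_PATH` value and the helper names `make_opener`, `landed_on_login` and `fetch` are hypothetical, chosen only for illustration.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlsplit

# Hypothetical value: the path the site redirects to once the session expires.
LOGIN_PATH = "/login"


def make_opener():
    """Build an opener that stores cookies and resends them on every request."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def landed_on_login(final_url, login_path=LOGIN_PATH):
    """Return True if the URL reached after redirects is the login page."""
    return urlsplit(final_url).path.rstrip("/") == login_path.rstrip("/")


def fetch(opener, url):
    """Fetch a page; raise if the site bounced the request to the login page."""
    with opener.open(url, timeout=30) as response:
        final_url = response.geturl()  # URL after any redirects
        if landed_on_login(final_url):
            raise RuntimeError("session expired while fetching " + url)
        return response.read()
```

The login step itself (usually a POST of the form fields through the same opener) is left out because it depends entirely on the site. Checking the final URL after each fetch is one way to notice that the session has lapsed before a whole batch of login pages gets saved by mistake.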

Awesome, really great, a huge thank you to you Sebsauvage.

e>>y
--
23 times 3 = 3 times 23 = sixty-nine
isa
 
Hello!
ScrapBook has always worked perfectly for me, but I just tried to capture a password-protected site after logging in beforehand. It downloads everything, but when I open the saved pages, I am always redirected to the login page :(
Any solution?
Thank you in advance.