VGOEmulator.net

A Development Project for the Vanguard:Saga of Heroes MMO

Wayback Machine

Random thoughts...

Moderator: Community Managers

4 posts • Page 1 of 1
Deltharien
Posts: 52
Joined: Sun May 18, 2014 12:19 pm
Wayback Machine

Post by Deltharien » Sat Aug 02, 2014 8:19 pm

So I have a question for the nerds. I found a cached copy of vgtact.com on the Wayback Machine, and most of the pages are intact. If I save them manually (Save Link As...), I get a working copy of each page, minus the images. Vgtact has loads of sub-pages and images, so I tried to mirror the site using wget, which failed, and then HTTrack, which also failed, just not as miserably. My attempts trigger a freaky redirect to https://archive.org/.../index.php/Main_Page, meaning that as the links are crawled, I just end up with folders full of copies of 'Main_Page'.

Any advice on how to accomplish my end goal? Or has someone already done this? Thanks.
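
A minimal sketch of how a mirroring script could notice the redirect described above instead of saving yet another copy of 'Main_Page'. It assumes snapshot URLs of the form https://web.archive.org/web/<timestamp>/<original URL>; the function names and the example.org addresses are made up for illustration, and this is not what wget or HTTrack do internally.

```python
import re
from typing import Optional

# Assumed snapshot URL shape:
#   https://web.archive.org/web/<timestamp><optional modifier>/<original URL>
# where the modifier is something like "id_" or "im_".
_SNAPSHOT = re.compile(
    r"^https?://web\.archive\.org/web/(\d{1,14})([a-z]{2}_)?/(.+)$"
)


def original_url(snapshot_url: str) -> Optional[str]:
    """Return the archived page's own URL, or None if this is not a snapshot URL."""
    match = _SNAPSHOT.match(snapshot_url)
    return match.group(3) if match else None


def _normalise(url: str) -> str:
    """Ignore scheme and trailing slash when comparing two original URLs."""
    return re.sub(r"^https?://", "", url).rstrip("/")


def redirected_elsewhere(requested: str, final: str) -> bool:
    """True if the crawler asked for one archived page but landed on another."""
    wanted, got = original_url(requested), original_url(final)
    if wanted is None or got is None:
        # One of the URLs is outside the snapshot namespace altogether.
        return True
    return _normalise(wanted) != _normalise(got)
```

A crawler would call `redirected_elsewhere(requested, final)` with the URL it asked for and the URL it ended up at after redirects, and skip saving the page when the result is `True`. A different snapshot timestamp for the same page does not count as a redirect here.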

John Adams
Retired
Posts: 4582
Joined: Wed Aug 28, 2013 9:40 am
Location: Phoenix, AZ.
Re: Wayback Machine

Post by John Adams » Sat Aug 02, 2014 11:18 pm

HTTrack has an option on the Spiders tab (I think) about following robots.txt rules. You can choose to ignore them, since they may be what is rejecting the link searches. I never had a lot of luck HTTracking the Wayback Machine, so the VGOPlayers pieces you see were literally me doing Save As on every page, then pointing directly to each image element (web slices) on there or on vanguardthegame.com until I got what I needed. Exhausting. Hope I don't need more, since vanguardthegame is also now gone.

falloutdc
Data Collector
Posts: 248
Joined: Sat Mar 29, 2014 10:04 am
Re: Wayback Machine

Post by falloutdc » Sun Aug 03, 2014 4:33 am

Same here. HTTrack failed on vanguardthegame because the site uses CSS files to reference its images and HTTrack does not rewrite those references, so I had to download literally hundreds of images by hand.
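
A minimal sketch of one way to avoid hunting for those images by hand: pull every `url(...)` reference out of a stylesheet and turn it into an absolute address that can be fetched. The function name and the example.org addresses are hypothetical, and `css_url` is assumed to be the stylesheet's address on the original site, not something taken from vanguardthegame itself.

```python
import re
from urllib.parse import urljoin

# Matches url(...) with optional single or double quotes around the reference.
_CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def css_image_urls(css_text, css_url):
    """List the absolute URLs referenced by url(...) in a stylesheet.

    css_text: the stylesheet contents as a string.
    css_url:  where the stylesheet lives on the original site (assumed),
              used to resolve relative references such as ../images/bg.png.
    """
    found = []
    for _quote, ref in _CSS_URL.findall(css_text):
        ref = ref.strip()
        if not ref or ref.startswith("data:"):
            continue  # nothing to download for empty or inline data references
        absolute = urljoin(css_url, ref)
        if absolute not in found:
            found.append(absolute)  # keep first-seen order, drop repeats
    return found
```

The resulting list can then be handed to whatever downloader you prefer, and the same regular expression can be reused to rewrite the references to local paths once the files are saved.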

It downloaded the forums fine, however (some CSS files are missing, I think, but these are on the Wayback Machine).

Back to the original poster: since you did not give a link, I did a search, and the third result was this:
https://web.archive.org/web/20140803113 ... ive.65538/
So send a tell to zewtastic to ask whether he still has the database.
If not, it should be possible to create rulesets to stop crawling certain links.
Duplicates don't hurt ;) (tm John Adams 2014)
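
A minimal sketch of what such a ruleset could look like if you script the crawl yourself: a list of patterns for links to skip, plus a set of URLs already seen so the same page is not queued twice. The 'Main_Page' rule comes from the redirect mentioned earlier in the thread; the `action=edit` rule, the function names, and the example.org addresses are assumptions for illustration, and this is not HTTrack's own filter syntax.

```python
import re

# Links matching any of these patterns are never crawled.
SKIP_RULES = [
    re.compile(r"/index\.php/Main_Page$"),  # the redirect target from the thread
    re.compile(r"[?&]action=edit"),         # assumed: wiki edit links are noise
]


def allowed(url, rules=SKIP_RULES):
    """True if no skip rule matches the URL."""
    return not any(rule.search(url) for rule in rules)


def next_links(candidates, seen, rules=SKIP_RULES):
    """Return the links worth crawling and record them in `seen`.

    candidates: links found on the page just downloaded.
    seen:       set of URLs already queued or fetched; updated in place.
    """
    fresh = []
    for url in candidates:
        if url in seen or not allowed(url, rules):
            continue
        seen.add(url)
        fresh.append(url)
    return fresh
```

Each time a page is downloaded, its links go through `next_links` with the same `seen` set, so skipped and repeated links never reach the download queue.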

Deltharien
Posts: 52
Joined: Sun May 18, 2014 12:19 pm
Re: Wayback Machine

Post by Deltharien » Sun Aug 03, 2014 7:38 am

Thanks for the replies. I'd love to see the data from both vgtact and vanguardcrafters resurrected. The newer sites were a valiant effort but are still missing so much.


All times are UTC-07:00