Saving Threads

Surge goes pre

Desert Nomad

Join Date: Mar 2008

I would never play off Occupy Wall St. for my guild name

We Are the 1 Percent

Me/

Hi All,

As we know, gwguru is closing down.

I would like to go back and save all the collection threads and any others that people want in text files so they can be accessed in the future. They have great information.

I'm trying to decide the best way to do this. Let's talk in the context of the collections thread.

http://www.guildwarsguru.com/forum/y...t10463144.html

It's 115 pages long so obviously just re-typing every post is not an option. There are pictures too.

I had 2 ideas:

1. If one could pull the HTML for every page, that would be 115 HTML files. Couldn't those pages be replicated elsewhere exactly? I can trivially pull the HTML for every page using MATLAB and save it all. Then someone could take those HTML files and replicate the current threads exactly, so the information isn't lost.

2. Again, I can pull the HTML for every page of the thread. I could then write code that goes through each HTML file, keeps the post content, discards all the other HTML garbage, and saves the result as a text file. Automating this for every thread isn't as trivial as just pulling the HTML.
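A minimal sketch of idea 2, written in Python rather than MATLAB. It assumes each post body sits in a `<div>` whose `id` starts with `post_message_`; that is a guess at vBulletin-style markup, not something confirmed in this thread, so check a saved page and adjust `id_prefix` as needed. The names `PostTextExtractor` and `extract_posts` are made up for illustration.

```python
from html.parser import HTMLParser


class PostTextExtractor(HTMLParser):
    """Collect the text of every <div> whose id starts with id_prefix."""

    def __init__(self, id_prefix="post_message_"):  # assumed id convention
        super().__init__(convert_charrefs=True)
        self.id_prefix = id_prefix
        self.posts = []      # finished post texts, in page order
        self._depth = 0      # <div> nesting depth inside a post; 0 = outside
        self._chunks = []    # text pieces of the post being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._depth:
            if tag == "div":
                # Nested div (e.g. a quote box): track it so we know
                # which </div> really closes the post.
                self._depth += 1
                self._chunks.append("\n")
            elif tag == "br":
                self._chunks.append("\n")
            elif tag == "img" and attrs.get("src"):
                # Keep a marker so pictures are not silently lost.
                self._chunks.append(f"[image: {attrs['src']}]")
            return
        if tag == "div" and (attrs.get("id") or "").startswith(self.id_prefix):
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if self._depth and tag == "div":
            self._depth -= 1
            if self._depth == 0:
                self.posts.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html_text, id_prefix="post_message_"):
    """Return the post texts found in one saved thread page."""
    parser = PostTextExtractor(id_prefix)
    parser.feed(html_text)
    parser.close()
    return parser.posts
```

Everything outside the matched containers (navigation, signatures, ads) is dropped, which is the "discard the HTML garbage" step. Author names and dates live elsewhere in the page and would need their own rules.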

Keep in mind I'm no computer expert. But I use computers for work to code stuff and stuff =P.

What are people's thoughts on the best way to save old threads people care about? Are these the best ideas? Am I forgetting something?

Cheers,
Buzan





EDIT:

I've saved 24 threads so far. I may post the list of which ones at some point. Next will be the ones here.

wind fire and ice

Jungle Guide

Join Date: Oct 2008

There

[ToA]

Hey Surge, I think our two reasonable options are either screenshots (which, if hosted somewhere reliable, will outlast the original pictures that we would be linking in the new threads) or your option of actually transporting the webpages.

Depending on the work involved with either, and who is capable of either, I think these are the two reasonable options. I think having archival screenshots is an OK and probably easier solution that lasts forever rather than until the original uploads break, but people may want the feeling of scrolling through the same pages again. I'm not sure.


If we wanted to be really hardcore about preserving GW Guru while we're all working towards this, we could do your solution with new uploads for the images. We can assign threads and have people go through and re-upload the old images to new hosts, to then be inserted back into the new threads. This is very elaborate of course, but it would mean our cherished old threads stay alive forever (or until people lose interest in maintaining them) instead of slowly decaying as old image hosts die, drop the images, etc.
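As a rough starting point for the re-upload idea, a saved page can be scanned for the images it embeds, which gives a per-thread list of what would need a new host. This is only a sketch in Python; `list_image_urls` is a made-up helper name, and it simply reads every `<img src=...>` in the page (smilies and avatars included).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Gather the src of every <img> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            absolute = urljoin(self.base_url, src)  # handles relative paths
            if absolute not in self.urls:           # keep first occurrence only
                self.urls.append(absolute)


def list_image_urls(html_text, base_url):
    """Return the distinct image URLs used in one saved page, in page order."""
    collector = ImageCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.urls
```

The resulting list could be split among volunteers, one thread each, before anything is re-uploaded.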

Marty Silverblade

Administrator

Join Date: Jun 2006

You could take the second option further. If you can strip out all of the html tags and whatnot from the raw thread data and get it into .csv format (or something similarly suitable) you could then make a new (offline) .html file that reads that data and displays it in a format similar to Guru. Preserving the images is a different issue altogether.

Not sure if that would be satisfactory, but it would be a viable thing to do, given enough know-how.
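A small Python sketch of that pipeline: post data goes into a .csv file, and a second step reads the .csv back and produces a plain offline .html page. The column names (`author`, `date`, `text`) and both function names are assumptions made for illustration; the layout is deliberately bare and does not try to imitate Guru's styling.

```python
import csv
import html

FIELDS = ["author", "date", "text"]  # assumed columns, not Guru's schema


def write_posts_csv(posts, fileobj):
    """Write a list of post dicts to an open text file as CSV."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for post in posts:
        writer.writerow({key: post[key] for key in FIELDS})


def render_thread_html(fileobj, title):
    """Read the CSV back and return one self-contained HTML page."""
    parts = [
        "<!DOCTYPE html>",
        f"<html><head><meta charset='utf-8'><title>{html.escape(title)}</title></head>",
        f"<body><h1>{html.escape(title)}</h1>",
    ]
    for row in csv.DictReader(fileobj):
        # Escape everything so stray markup in a post cannot break the page.
        body = html.escape(row["text"]).replace("\n", "<br>\n")
        parts.append(
            "<div class='post'>"
            f"<p><strong>{html.escape(row['author'])}</strong> "
            f"<em>{html.escape(row['date'])}</em></p>"
            f"<p>{body}</p></div><hr>"
        )
    parts.append("</body></html>")
    return "\n".join(parts)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, so that line breaks inside posts survive the round trip.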

Surge goes pre

Desert Nomad

Join Date: Mar 2008

I would never play off Occupy Wall St. for my guild name

We Are the 1 Percent

Me/

Marty,

I've been experimenting.

I can pull the HTML from a Guru page and open it perfectly in Firefox or Safari. Even the pictures, which are embedded from imgur, seem to load in.

I think I'm gonna save everything I personally want this way.

Buzan

Smoke Nightvogue

Lion's Arch Merchant

Join Date: Sep 2005

Moscow, Russia.

Random Ascalon Fools, Soldiers of Thunderstorm.

R/Mo

Quote:
Originally Posted by Surge goes pre
I've already posted in multiple places that I'm going to try to preserve the knowledge. And I believe I have the computer know how to save a lot of it. I've already saved 5 threads with 300+ pages of posts to my hard drive by ripping the HTML. These were the 5 threads that I most wanted. That amounted to 50MB of HTML code. I'm happy to fill my 32GB flash drive with guru pages. But...

TELL ME WHAT KNOWLEDGE (THREAD URLS) YOU WANT ME TO SAVE AND I WILL ADD THEM TO RIPPING CODE.

My goal is exactly what you are worried about but no one seems to want to help. Saving the knowledge. So everyone (not pointing this at Max - general statement) help me do it instead of telling people it's a goal. I have a plan - it may not be perfect, but it's the only one I've heard that seems feasible. All I need is the URL of the thread you want to save. If no one tells me and no one comes up with a better plan, the knowledge is gone. But at least I have a plan.

Later we can decide how to make the information available again. But at least a copy will be saved!! Then we decide how to make it available. I don't know if the best way is through a drop box, or on legacy or gw2guru, or a different website with only these files.

So far only cosy has asked me to save any thread - the armor dye thread. And it will be added to my code when I'm home later.


As I said, if the community decides to all go to legacy, I go too. If it's split, I'll be split. If it's all to gw2guru, I'll go there. Wherever I go I wanna be the forum historian.

Alright, then, here are those I'd wish to see saved in their entirety:

1) Challenge to all the hardcore Pre-Searing people
2) paw·ned²: All-In-One Team Build & Template Editor
3) LDoA under 17 hours
4) Death Emote
5) Borat's Guide To Pugging Correctly
6) GvG - Improvement
7) The essential reminiscing of good times 'thread'...
8) Report of a very dangerous bug
9) do you remember when
10) Players with R15?
11) Lol paragon gvg
12) Ha in crisis
13) Beginner's Guide to Guild vs. Guild Battles
14) Avatar of Lyssa
15) The Aspiring Drunkard's Guide to Binge Drinking
16) Index of Ideas - Check this thread and use SEARCH before posting ANY new thread (this pretty much needs saving of all of the threads contained therein)
17) The best guilds in game
18) Announcing GWLP
19) HA golden ages...
20) Pokemon M Aster's Ele ball on the loose
21) Leeloof got r15
22) Legendary Hero
23) Rank Fifteen
24) GvG Split Idea
25) Rank 15
26) My HA-builds
27) Guide to beating Zergway in HA
28) Rank 15
29) HoH Relic Run
30) Interrupter Bot Program, is it possible?
31) Question on Higher ranked teams
32) [Guide] Guild Wars Errors - Explanations & Solutions
33) What is the best HA-guild at the moment?
34) Interrupter guide? Tips?
35) MOST 6v6 HoH holds ???
36) Famous people and guilds in HA
37) Screenshot of the Rank 12 Emote
38) I *think* they finally fixed the droknar armor in ascalon arena exploit!
39) Hi. Read me before you post stupid monk questions.
40) You have died 0 times @ Hell's Precipice
41) guild wars: is it really all skill?
42) Top Alpha Testers Announced!
43) Happy Birthday Spooky!
44) Omnia Mutantur, Nihil Interit

And thanks in advance for this initiative of yours!

Surge goes pre

Desert Nomad

Join Date: Mar 2008

I would never play off Occupy Wall St. for my guild name

We Are the 1 Percent

Me/

There are some real good ones there =P

http://www.guildwarsguru.com/forum/f...t10168895.html
http://www.guildwarsguru.com/forum/b...t10119932.html

These 2 have been saved.

Got my work cut out for me adding all these haha.

Shasgaliel

Jungle Guide

Join Date: Apr 2008

[bomb]

This one please:

http://www.guildwarsguru.com/forum/g...t10488969.html

Without it some players would not be able to play at all.

Surge goes pre

Desert Nomad

Join Date: Mar 2008

I would never play off Occupy Wall St. for my guild name

We Are the 1 Percent

Me/

Quote:
Originally Posted by Shasgaliel
This one please:

http://www.guildwarsguru.com/forum/g...t10488969.html

Without it some players would not be able to play at all.

I think this one is an important one ^

Piippo

Academy Page

Join Date: Jul 2009

Finland

E/

Appreciate the archiving efforts immensely, there's so much history on this site that deserves to be preserved. Off the top of my head, this thread in particular comes to mind:

http://www.guildwarsguru.com/forum/w...t10379144.html

140 pages of pure gold, gathered over 6 years, with some absolute classics in there.

bsoltan

Site Contributor

Join Date: Dec 2005

UK

[SoF]

Does anyone have advice on the best way to save a thread in this way? Is it something that anyone could do to get the threads that they want?

Cuilan

Forge Runner

Join Date: Mar 2008

Me/

Interesting how IPB (what Guild Wars 2 Guru runs) lets you download topic pages with a simple button, but this installation of vB doesn't.

Surge goes pre

Desert Nomad

Join Date: Mar 2008

I would never play off Occupy Wall St. for my guild name

We Are the 1 Percent

Me/

Bsoltan, I don't think anyone can do it. At least not my way. I use the urlread('http://www.guildwarsguru.com/..........') MATLAB function and then fprintf to write to file.
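For anyone without MATLAB, roughly the same fetch-then-write loop can be sketched with Python's standard library. This is not the code described above, only an equivalent idea: `save_pages` and `fetch_url` are made-up names, the list of page URLs has to be supplied by hand, and the output file naming is arbitrary.

```python
import urllib.request
from pathlib import Path


def fetch_url(url, timeout=30):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def save_pages(urls, out_dir, fetch=fetch_url):
    """Fetch each URL in order and write it to out_dir as page_0001.html, ...

    `fetch` can be swapped out, e.g. for a function that waits between
    requests or for a fake one when testing without a network.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, url in enumerate(urls, start=1):
        path = out / f"page_{number:04d}.html"
        # Raw bytes are written untouched so the page's encoding is preserved.
        path.write_bytes(fetch(url))
        saved.append(path)
    return saved
```

Calling `save_pages(list_of_thread_page_urls, "collections_thread")` would then leave one .html file per page in that folder, which can be opened directly in a browser.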

Kattar

EXCESSIVE FLUTTERCUSSING

Join Date: Mar 2007

SMS (lolgw2placeholder)

Me/

Quote:
Originally Posted by Cuilan
Interesting how IPB (what Guild Wars 2 Guru runs) lets you download topic pages with a simple button, but this installation of vB doesn't.

This version of vB hasn't been modified since before Curse purchased the website.

Smoke Nightvogue

Lion's Arch Merchant

Join Date: Sep 2005

Moscow, Russia.

Random Ascalon Fools, Soldiers of Thunderstorm.

R/Mo

Quote:
Originally Posted by bsoltan
Does anyone have advice on the best way to save a thread in this way? Is it something that anyone could do to get the threads that they want?

Quote:
Originally Posted by Surge goes pre
Bsoltan, I don't think anyone can do it. At least not my way. I use the urlread('http://www.guildwarsguru.com/..........') MATLAB function and then fprintf to write to file.

Here's the link to the program I'm using to try to save the entire forum content at the moment. I'm not sure how it will all turn out in the end, though, since GWG is not simply a regular website but a database-backed PHP application, which makes it much harder for the script to retain the existing level of data consistency.

bsoltan

Site Contributor

Join Date: Dec 2005

UK

[SoF]

Quote:
Originally Posted by Smoke Nightvogue
Here's the link to the program I'm using to try to save the entire forum content at the moment. I'm not sure how it will all turn out in the end, though, since GWG is not simply a regular website but a database-backed PHP application, which makes it much harder for the script to retain the existing level of data consistency.

I have been trying the exact same one. It takes its time and, same as you, I'm not sure what the result will be.

T1Cybernetic

Desert Nomad

Join Date: Sep 2005

Wakefield, West Yorkshire, Uk, Nr Earth

Alternate Evil Gamers [aeg]

N/

Same here. I have 11GB saved up to now, but I haven't checked it over yet. Seems to have done the trick for now, though.

I have a feeling it's not going to be this simple, but having the entire site and forums for offline viewing is awesome.

bsoltan

Site Contributor

Join Date: Dec 2005

UK

[SoF]

It worked! How awesome.

Kvinna

Administrator

Join Date: Aug 2009

As I have just stated in the original thread, GWG will stay on in archive mode after we transition to the GW2G forums. Feel free to continue saving what you want for your personal archives, but just know that we won't be losing any of this precious history!

bsoltan

Site Contributor

Join Date: Dec 2005

UK

[SoF]

Quote:
Originally Posted by bsoltan
It worked! How awesome.

Just as well. It turns out that HTTrack didn't archive everything. It might all be there, but it isn't linked together in a way that is easily browsable from the forum.

Maybe with some fiddling. I'm trying WebCopy as well.

Shayne Hawke

Departed from Tyria

Join Date: May 2007

Clan Dethryche [dth]

R/

Here are some threads to archive, in case you are still doing this:

An Open Letter to ANet
An Open Letter to ANet - Part 2
6 Months of Skill Updates: A Review
Fix Snowball
Ban during finals of monthly automated tournament
Scarred Psyche: An Extensive Guide
Rollerbeetle Racing Top Score Analysis
Why XTH should remain broken
Rollerbeetle Racing Guide : Namkey + Yuri
What's your stand on GW1? You think Anet's doing a good job?
Recent Account Bans
Petition To Demand A Response From Anet On RMT Botters and Exploiters

Ensign put out a lot of good content that should be saved.

Damage Per Second, or How I Learned to Love The Buffs
Focus Shenanigans - How To Break an Energy Denial Lock
Adrenaline - The Details
Numbers for the Cleave / Eviscerate Debate
Why Nuking Sucks
Conceptual Issue With Elementalists
June 2007 Nerf Wishlist
Where Balance Went Awry
General Balance Thoughts
A.Net Doesn't Understand Game Balance

Topics about Ursan:

PvE Balance - Part 2
Do you miss Ursan's Blessing?

Topics about Shadow Form:

Update - Thursday, May 22, 2008
SF Update will destroy GW economy
Did you make an Assassin just because of the perma SF farm?
What happened to Ecto and Shard Prices
[Dev Update] Shadow Form Balance Changes - 2 July 2008
OMG They are going to buff SF
Update - Thursday August 7
essence of celerity = NERFED
Perma is the new PvE?
UWSC going to be nerfed?
Best way to nerf UWSC?
Confirmation that the Live Team is going after SF this year
Will nerfing SF really help anything to do with the game?
UWSC Nerf and Anet's "Progress"
Is this Anet's solution to shadow form?
Shadow Form meets the end
Actually...i think SF should stay.
Preliminary Skill Update Notes: Feb 19
Update - Thursday, February 25, 2010

Surge goes pre

Desert Nomad

Join Date: Mar 2008

I would never play off Occupy Wall St. for my guild name

We Are the 1 Percent

Me/

As far as I'm aware, the whole site is going to be archived. So I don't need to do this anymore.

Shasgaliel

Jungle Guide

Join Date: Apr 2008

[bomb]

I would assume it is only temporary as well. In a year or two, when they see there are not many views, they will pull the plug. It is safer to keep a copy of what is important in case the notice about a potential closure is missed.

Age

Hall Hero

Join Date: Jul 2005

California Canada/BC

STG Administrator

Mo/

Please transfer this one; it brought me here, as I was more on The Guild Hall.
http://www.guildwarsguru.com/forum/u...980&highlight=
This one too.
http://www.guildwarsguru.com/forum/5...111&highlight=
This one as well, if it is possible.
http://www.guildwarsguru.com/forum/g...247&highlight=

T1Cybernetic

Desert Nomad

Join Date: Sep 2005

Wakefield, West Yorkshire, Uk, Nr Earth

Alternate Evil Gamers [aeg]

N/

Those are great threads, Age. Worthy of saving!

Age

Hall Hero

Join Date: Jul 2005

California Canada/BC

STG Administrator

Mo/

Please don't forget to save those threads and move them over to GW2G.

Smoke Nightvogue

Lion's Arch Merchant

Join Date: Sep 2005

Moscow, Russia.

Random Ascalon Fools, Soldiers of Thunderstorm.

R/Mo

Quote:
Originally Posted by bsoltan
Just as well. It turns out that HTTrack didn't archive everything. It might all be there, but it isn't linked together in a way that is easily browsable from the forum.

Maybe with some fiddling. I'm trying WebCopy as well.

Here's the link to the software capable of actually archiving everything. It just needs a little tuning of how thread/post patterns are processed.

What needs to be specified is:

Forum type: Generic - vBulletin

Forums and sub-forums

All links contain this string: *-f#.html
Links to next pages contain: *-f#p#.html

Topics

All links contain this string: *-t#.html
Links to next pages contain: *-t#p#.html

By the start of next week, if the forum is still out there, that would let you have a personal copy for offline reading.
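To check a list of saved links against the four patterns above before starting a long crawl, they can be written as regular expressions. This Python sketch assumes that `#` stands for a run of digits and `*` for any text, which is how such wildcards are usually read but is not spelled out here; `classify_link` is a made-up helper, not part of the archiving software.

```python
import re

# More specific patterns come first so "-t12p3.html" is not read as "-t12.html".
LINK_PATTERNS = [
    ("topic_page", re.compile(r"-t\d+p\d+\.html$")),  # *-t#p#.html
    ("topic", re.compile(r"-t\d+\.html$")),           # *-t#.html
    ("forum_page", re.compile(r"-f\d+p\d+\.html$")),  # *-f#p#.html
    ("forum", re.compile(r"-f\d+\.html$")),           # *-f#.html
]


def classify_link(url):
    """Return which of the four link kinds a URL matches, or None."""
    # Ignore any query string or fragment after the file name.
    path = url.split("?", 1)[0].split("#", 1)[0]
    for kind, pattern in LINK_PATTERNS:
        if pattern.search(path):
            return kind
    return None
```

Links that come back as `None` (for example search results or `highlight=` URLs) are the ones the tool would skip under these settings, so they are worth a manual look.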

Quote:
Originally Posted by Surge goes pre
As far as I'm aware, the whole site is going to be archived. So I don't need to do this anymore.

No worries, I'm on it as I'm writing this.