Saving Threads
Surge goes pre
Hi All,
As we know, gwguru is closing down.
I would like to go back and save all the collection threads and any others that people want in text files so they can be accessed in the future. They have great information.
I'm trying to decide the best way to do this. Let's talk in the context of the collections thread.
http://www.guildwarsguru.com/forum/y...t10463144.html
It's 115 pages long so obviously just re-typing every post is not an option. There are pictures too.
I had 2 ideas:
1. If one could pull all the HTML for every page, it would be 115 HTML files. Couldn't those pages be replicated elsewhere exactly? I can trivially pull the HTML files for every page using MATLAB and save them all... Then someone could take those HTML files and replicate the current threads exactly so the information isn't lost.
2. Again, I can pull the HTML files for every page of the thread. I could write code that goes through each HTML file, keeps all the post information, discards all the other HTML garbage, and saves the result as a text file... Automating this for every page isn't as trivial as just pulling the HTML.
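For idea 2, the stripping step could be sketched roughly like this. It is written in Python purely for illustration (not the MATLAB mentioned above), and it assumes each post body sits in a <div> whose id starts with post_message_, which is a guess about the page markup and should be checked against the real page source. PostTextExtractor and extract_posts are hypothetical names.

```python
from html.parser import HTMLParser

# Assumption: each post body is a <div> whose id starts with this prefix.
# Check the real page source and adjust before relying on it.
POST_ID_PREFIX = "post_message_"

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class PostTextExtractor(HTMLParser):
    """Collects the visible text of each post and ignores everything else."""

    def __init__(self):
        super().__init__()
        self.posts = []    # finished post texts, in page order
        self._depth = 0    # nesting depth inside the current post (0 = outside)
        self._chunks = []  # text fragments of the post being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            # Keep line breaks inside a post; ignore other void elements.
            if self._depth and tag == "br":
                self._chunks.append("\n")
            return
        if self._depth:
            self._depth += 1
        elif tag == "div" and (dict(attrs).get("id") or "").startswith(POST_ID_PREFIX):
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # The post's own <div> just closed, so the post is complete.
            self.posts.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html_text):
    """Return the text of every post found in one saved HTML page."""
    parser = PostTextExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.posts
```

The depth counter relies on tags being properly closed, so badly formed pages may need a more forgiving parser. Images are dropped here; only text survives.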
Keep in mind I'm no computer expert. But I use computers for work to code stuff and stuff =P.
What are people's thoughts on the best way to save old threads people care about? Are these the best ideas? Am I forgetting something?
Cheers,
Buzan
EDIT::::
I've saved 24 threads so far. I will post which ones at some point, maybe. Next will be the ones listed here.
wind fire and ice
Hey surge, I think that our two reasonable options are either screenshots (which, if hosted somewhere reliable, will outlast the original pictures that we would be linking in the new threads) or your option of actually transporting the webpages.
Depending on work involved with either, and who is capable of either, I think these are the two reasonable options. I think having archival screenshots is an OK and probably easier solution that lasts forever rather than until the original uploads break, but people may want the feeling of scrolling through the same pages again. I'm not sure.
If we wanted to be really hardcore about preserving GW Guru while we're all working towards this, we could do your solution with new uploads for the images. We can assign threads and have people go through and re-upload the old images to new hosts, to then be inserted back into the new threads. This is very elaborate of course, but would mean our cherished old threads stay alive forever (or until people lose interest in maintaining them) instead of slowly decaying as old image hosts die, drop the images, etc.
Marty Silverblade
You could take the second option further. If you can strip out all of the html tags and whatnot from the raw thread data and get it into .csv format (or something similarly suitable) you could then make a new (offline) .html file that reads that data and displays it in a format similar to Guru. Preserving the images is a different issue altogether.
Not sure if that would be satisfactory, but it would be a viable thing to do, given enough know-how.
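A rough sketch of that idea, in Python: write the extracted posts to CSV, then build a simple offline HTML page from the CSV. The column names (author, text) and the function names posts_to_csv and csv_to_html are hypothetical, and the page layout is a bare placeholder, not a reproduction of Guru's styling.

```python
import csv
import html
import io

FIELDS = ["author", "text"]  # hypothetical column names


def posts_to_csv(posts):
    """Serialize a list of {'author': ..., 'text': ...} dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(posts)
    return buffer.getvalue()


def csv_to_html(csv_text, title):
    """Render the CSV rows as a static page with one block per post."""
    rows = csv.DictReader(io.StringIO(csv_text))
    parts = [
        '<html><head><meta charset="utf-8"><title>%s</title></head><body>'
        % html.escape(title)
    ]
    for row in rows:
        # Escape the stored text so it is shown literally, then restore line breaks.
        body = html.escape(row["text"]).replace("\n", "<br>")
        parts.append(
            '<div class="post"><h4>%s</h4><p>%s</p></div>'
            % (html.escape(row["author"]), body)
        )
    parts.append("</body></html>")
    return "\n".join(parts)
```

CSV quoting handles commas and line breaks inside a post, so the text survives the round trip. Images would still need separate handling.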
Surge goes pre
Marty,
I've been experimenting.
I can pull the HTML from a guru page and open it perfectly in Firefox or Safari. Even the pictures, which are embedded from imgur, seem to load.
I think I'm gonna save everything I personally want this way.
Buzan
Smoke Nightvogue
Quote:
I've already posted in multiple places that I'm going to try to preserve the knowledge. And I believe I have the computer know-how to save a lot of it. I've already saved 5 threads with 300+ pages of posts to my hard drive by ripping the HTML. These were the 5 threads that I most wanted. That amounted to 50MB of HTML code. I'm happy to fill my 32GB flash drive with guru pages. But...
TELL ME WHAT KNOWLEDGE (THREAD URLS) YOU WANT ME TO SAVE AND I WILL ADD THEM TO MY RIPPING CODE. My goal is exactly what you are worried about, but no one seems to want to help. Saving the knowledge. So everyone (not pointing this at Max - general statement), help me do it instead of telling people it's a goal. I have a plan - it may not be perfect, but it's the only one I've heard that seems feasible. All I need is the URL of the thread you want to save. If no one tells me and no one comes up with a better plan, the knowledge is gone. But at least I have a plan. Later we can decide how to make the information available again. But at least a copy will be saved!! I don't know if the best way is through a drop box, or on legacy or gw2guru, or a different website with only these files. So far only cosy has asked me to save any thread - the armor dye thread. And it will be added to my code when I'm home later. As I said, if the community decides to all go to legacy, I go too. If it's split, I'll be split. If it's all to gw2guru, I'll go there. Wherever I go I wanna be the forum historian. |
1) Challenge to all the hardcore Pre-Searing people
2) paw·ned²: All-In-One Team Build & Template Editor
3) LDoA under 17 hours
4) Death Emote
5) Borat's Guide To Pugging Correctly
6) GvG - Improvement
7) The essential reminiscing of good times 'thread'...
8) Report of a very dangerous bug
9) do you remember when
10) Players with R15?
11) Lol paragon gvg
12) Ha in crisis
13) Beginner's Guide to Guild vs. Guild Battles
14) Avatar of Lyssa
15) The Aspiring Drunkard's Guide to Binge Drinking
16) Index of Ideas - Check this thread and use SEARCH before posting ANY new thread (this pretty much needs saving of all of the threads contained therein)
17) The best guilds in game
18) Announcing GWLP
19) HA golden ages...
20) Pokemon M Aster's Ele ball on the loose
21) Leeloof got r15
22) Legendary Hero
23) Rank Fifteen
24) GvG Split Idea
25) Rank 15
26) My HA-builds
27) Guide to beating Zergway in HA
28) Rank 15
29) HoH Relic Run
30) Interrupter Bot Program, is it possible?
31) Question on Higher ranked teams
32) [Guide] Guild Wars Errors - Explanations & Solutions
33) What is the best HA-guild at the moment?
34) Interrupter guide? Tips?
35) MOST 6v6 HoH holds ???
36) Famous people and guilds in HA
37) Screenshot of the Rank 12 Emote
38) I *think* they finally fixed the droknar armor in ascalon arena exploit!
39) Hi. Read me before you post stupid monk questions.
40) You have died 0 times @ Hell's Precipice
41) guild wars: is it really all skill?
42) Top Alpha Testers Announced!
43) Happy Birthday Spooky!
44) Omnia Mutantur, Nihil Interit
And thanks in advance for this initiative of yours!
Surge goes pre
There are some really good ones there =P
http://www.guildwarsguru.com/forum/f...t10168895.html
http://www.guildwarsguru.com/forum/b...t10119932.html
These 2 have been saved.
Got my work cut out for me adding all these haha.
Shasgaliel
This one please:
http://www.guildwarsguru.com/forum/g...t10488969.html
Without it some players would not be able to play at all.
Surge goes pre
Quote:
This one please:
http://www.guildwarsguru.com/forum/g...t10488969.html Without it some players would not be able to play at all. |
Piippo
Appreciate the archiving efforts immensely, there's so much history on this site that deserves to be preserved. Off the top of my head, this thread in particular comes to mind:
http://www.guildwarsguru.com/forum/w...t10379144.html
140 pages of pure gold, gathered over 6 years, with some absolute classics in there.
bsoltan
Does anyone have advice on the best way to save a thread in this way? Is it something that anyone could do to get the threads that they want?
Cuilan
Interesting how IPB (what Guild Wars 2 Guru runs) lets you download topic pages with a simple button, but this installation of vB doesn't.
Surge goes pre
Bsoltan, I don't think just anyone can do it. At least not my way. I use the 'urlread('http://www.guildwarsguru.com/..........')' MATLAB function and then fprintf to write to file.
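For anyone without MATLAB, the same read-the-URL-then-write-to-file approach can be sketched in Python. This is not the code actually used here. It assumes page N of a thread is reached by inserting pN before .html, following the *-t#p#.html pattern mentioned elsewhere in this thread, and the names fetch_bytes, page_urls and save_thread are hypothetical.

```python
import urllib.request
from pathlib import Path


def fetch_bytes(url):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def page_urls(first_page_url, page_count):
    """Page 1 is the thread URL itself; later pages insert p<N> before .html.

    Assumption: this URL scheme is inferred, not confirmed for every thread.
    """
    if not first_page_url.endswith(".html"):
        raise ValueError("expected a thread URL ending in .html")
    stem = first_page_url[: -len(".html")]
    later = [f"{stem}p{n}.html" for n in range(2, page_count + 1)]
    return [first_page_url] + later


def save_thread(first_page_url, page_count, out_dir, fetch=None):
    """Save every page of a thread as page_001.html, page_002.html, ..."""
    fetch = fetch or fetch_bytes
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, url in enumerate(page_urls(first_page_url, page_count), start=1):
        target = out_dir / f"page_{number:03d}.html"
        # Write raw bytes so the page's original encoding is left untouched.
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved
```

The fetch argument exists only so the saving logic can be tried without touching the network. A polite delay between requests would be a sensible addition for long threads.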
Kattar
Smoke Nightvogue
Quote:
Does anyone have advice on the best way to save a thread in this way? Is it something that anyone could do to get the threads that they want?
|
Quote:
Bsoltan, I don't think just anyone can do it. At least not my way. I use the 'urlread('http://www.guildwarsguru.com/..........')' MATLAB function and then fprintf to write to file.
|
bsoltan
Quote:
Here's the link to the program I'm trying to save the entire forum content with at the moment. Not sure how it all turns out in the end, though, since GWG is not simply a regular website, but a database-utilizing PHP application, which makes it much more complicated for the script when it comes to retaining the existing level of data consistency.
|
T1Cybernetic
Same here. I have 11GB saved so far, but I haven't checked it over yet. Seems to have done the trick for now, though.
I have a feeling it's not going to be this simple but having the entire site and forums for offline viewing is awesome
bsoltan
It worked! How awesome.
Kvinna
As I have just stated in the original thread, GWG will stay on in archive mode after we transition to the GW2G forums. Feel free to continue saving what you want for your personal archives, but just know that we won't be losing any of this precious history!
bsoltan
Shayne Hawke
Here are some threads to archive, in case you are still doing this:
An Open Letter to ANet
An Open Letter to ANet - Part 2
6 Months of Skill Updates: A Review
Fix Snowball
Ban during finals of monthly automated tournament
Scarred Psyche: An Extensive Guide
Rollerbeetle Racing Top Score Analysis
Why XTH should remain broken
Rollerbeetle Racing Guide : Namkey + Yuri
What's your stand on GW1? You think Anet's doing a good job?
Recent Account Bans
Petition To Demand A Response From Anet On RMT Botters and Exploiters
Ensign put out a lot of good content that should be saved.
Damage Per Second, or How I Learned to Love The Buffs
Focus Shenanigans - How To Break an Energy Denial Lock
Adrenaline - The Details
Numbers for the Cleave / Eviscerate Debate
Why Nuking Sucks
Conceptual Issue With Elementalists
June 2007 Nerf Wishlist
Where Balance Went Awry
General Balance Thoughts
A.Net Doesn't Understand Game Balance
Topics about Ursan:
PvE Balance - Part 2
Do you miss Ursan's Blessing?
Topics about Shadow Form:
Update - Thursday, May 22, 2008
SF Update will destroy GW economy
Did you make an Assassin just because of the perma SF farm?
What happened to Ecto and Shard Prices
[Dev Update] Shadow Form Balance Changes - 2 July 2008
OMG They are going to buff SF
Update - Thursday August 7
essence of celerity = NERFED
Perma is the new PvE?
UWSC going to be nerfed?
Best way to nerf UWSC?
Confirmation that the Live Team is going after SF this year
Will nerfing SF really help anything to do with the game?
UWSC Nerf and Anet's "Progress"
Is this Anet's solution to shadow form?
Shadow Form meets the end
Actually...i think SF should stay.
Preliminary Skill Update Notes: Feb 19
Update - Thursday, February 25, 2010
Surge goes pre
As far as I'm aware, the whole site is going to be archived. So I don't need to do this anymore.
Shasgaliel
I would assume it is only temporary as well. In a year or two when they see there are not many views they will pull the plug. It is safer to keep the copy of what is important in case the notice about potential closure is missed.
Age
Please transfer this one. It brought me here, as I was more on The Guild Hall.
http://www.guildwarsguru.com/forum/u...980&highlight=
This one too.
http://www.guildwarsguru.com/forum/5...111&highlight=
This if it is possible.
http://www.guildwarsguru.com/forum/g...247&highlight=
T1Cybernetic
Those are great threads, Age. Worthy of saving!
Age
Please don't forget to save those threads and move them over to GW2G.
Smoke Nightvogue
Quote:
Just as well.. turns out that httrack didn't archive everything. It might all be there, but it isn't linked together in a way that is easily browsable from the forum.
Maybe with some fiddling. I'm trying WebCopy as well. |
What needs to be specified is:
Forum type: Generic - vBulletin
Forums and sub-forums
All links contain this string: *-f#.html
Links to next pages contain: *-f#p#.html
Topics
All links contain this string: *-t#.html
Links to next pages contain: *-t#p#.html
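To see what those link patterns select, here is a small Python sketch that turns them into regular expressions and classifies URLs. It assumes * stands for any text and # for a run of digits, which is an interpretation of the program's wildcard syntax and not taken from its documentation. The names PATTERNS, to_regex and classify are hypothetical.

```python
import re

# The four patterns listed above, keyed by what kind of page they identify.
PATTERNS = {
    "forum": "*-f#.html",
    "forum_page": "*-f#p#.html",
    "topic": "*-t#.html",
    "topic_page": "*-t#p#.html",
}


def to_regex(pattern):
    """Translate a wildcard pattern: '*' = any text, '#' = one or more digits."""
    pieces = []
    for ch in pattern:
        if ch == "*":
            pieces.append(".*")
        elif ch == "#":
            pieces.append(r"\d+")
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces))


COMPILED = {name: to_regex(pattern) for name, pattern in PATTERNS.items()}


def classify(url):
    """Return the name of the pattern that matches the whole URL, or None."""
    for name, regex in COMPILED.items():
        if regex.fullmatch(url):
            return name
    return None
```

Because each pattern must match the whole URL, a paged link such as ...-t123p2.html is reported as topic_page and not as topic.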
And by the start of next week, if the forum is still out there, it would let you have a personal copy for offline reading.
No worries, I'm on it as I'm writing this.