Whats on your .htaccess?

Everything related to the visual and coding aspects of websites.
Sofia
Administrator
Posts: 981
Joined: Sun Jul 01, 2012 1:25 am
Location: Italy

Re: Whats on your .htaccess?

Post by Sofia »

Yuzuki wrote:@Sofia
Did they remove your site completely, or only block it? I'm wondering because I don't plan to renew an old domain I had, but I want it removed from the WBM. What happens when the domain is no longer there and there is no robots.txt telling them to hide the site? Will it show up online again?
I managed to find the email I had sent to the Internet Archive (in 2006…time really flies! o_O) so I can be more precise :) Apparently the domain was already offline, meaning I couldn't use the robots.txt file, and customer support told me they would exclude the domain from the Wayback Machine.

I double-checked this morning, and if I try to access the archived version I get the following message: “This URL has been excluded from the Wayback Machine.” I suppose they have removed it completely.
Show me a hero and I'll write you a tragedy.
Yuzuki
Posts: 94
Joined: Sat Oct 06, 2012 11:37 pm

Re: Whats on your .htaccess?

Post by Yuzuki »

@Mikari
Because when I started doing websites there wasn't that much privacy awareness, there is stuff I would rather see removed. Nothing too bad, but a couple of my old email addresses, etc.

@Sofia
Oh, it's great to know that it has stayed "removed" after all these years. I will have to look for their email and give it a try!
anon

Re: Whats on your .htaccess?

Post by anon »

Sofia wrote:Unfortunately Google is not the only one, but you can ask the Wayback Machine to exclude your site. I did it a few years ago when I asked them to remove one of my old domains from their archive. This is obviously a partial solution to the problem since it doesn't work with search engine caches.

Anyway, this discussion reminded me that I should really look into the robots.txt thing. I've also been meaning to make custom error pages for a long time (actually remake, since I had them before switching to WordPress), but preventing my site from being archived by anybody is definitely more important as I'm quite paranoid about "anonymous archiving".
You can stop Google from caching your site by using your Google account to control the search caches. It's easier to manage if your website still exists and you have access to the server.

I too asked Wayback to stop publicly archiving my websites earlier this year. They're pretty chill about it. They mention they want your mailing address and photo ID for identity purposes, but I didn't have to give that stuff out when I asked them. As long as they have a way of figuring out you were the original webmaster, it should be fine. In my case I e-mailed them from an old e-mail account that was listed on my old site.

This code in robots.txt disallows crawling by every bot that obeys the file:
User-agent: *
Disallow: /
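
If you want to double-check that those two lines really block everything before uploading the file, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_blocked`, the example.com URL, and the bot names are only placeholders for illustration, and this only tells you what a well-behaved crawler should do, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as posted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text forbids user_agent from fetching url.

    Hypothetical helper for illustration; it parses the text locally
    and never contacts a server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example bot names only; any name falls under the "*" rule.
    for agent in ("Googlebot", "ia_archiver", "SomeOtherBot"):
        blocked = is_blocked(ROBOTS_TXT, agent, "https://example.com/any/page.html")
        print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

Keep in mind robots.txt is only a request: a crawler that ignores the file can still fetch the pages.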
Sofia
Administrator
Posts: 981
Joined: Sun Jul 01, 2012 1:25 am
Location: Italy

Re: Whats on your .htaccess?

Post by Sofia »

Mikari wrote:Is there another downside to it beyond the part about an outdated version continuing to exist? I haven't looked into the robots thing yet either.
In my case it was an "anti-copycat" measure :/ The former co-owner of that domain kept re-using my stuff (both layouts and content I designed/wrote) for her own domain. Removing the domain solved the problem (she wasn't clever enough to save everything on her computer…apparently she only used the Wayback Machine when she needed some "inspiration" XD).
Yuzuki wrote:Oh, it's great to know that it has stayed "removed" after all these years. I will have to look for their email and give it a try!
Yep…it wasn't nice to see someone, moreover someone I knew (or rather thought I knew), steal from me. If something similar happened again I would have trouble proving the theft since I no longer have the original files.
Joe wrote:I too asked Wayback to stop publicly archiving my websites earlier this year. They're pretty chill about it. They mention they want your mailing address and photo ID for identity purposes, but I didn't have to give that stuff out when I asked them. As long as they have a way of figuring out you were the original webmaster, it should be fine. In my case I e-mailed them from an old e-mail account that was listed on my old site.
It seems they are slightly stricter now than a few years ago; I only sent them the email without providing any other information. It makes sense, though. Otherwise someone could ask to remove sites they didn't own.
Show me a hero and I'll write you a tragedy.
Eden
Posts: 314
Joined: Wed Jul 18, 2012 1:36 pm
Location: Ponyville

Re: Whats on your .htaccess?

Post by Eden »

I must be the only one who doesn't care that the WBM archives my sites. I really like that they do that, because I was able to grab a huge member list from a fanlisting I owned many years ago but lost. I also get to show people what some of my designs used to look like way back.
Not just another romantic comedy. ~ Get Dropbox!
anon

Re: Whats on your .htaccess?

Post by anon »

Eden wrote:I must be the only one who doesn't care that the WBM archives my sites. I really like that they do that, because I was able to grab a huge member list from a fanlisting I owned many years ago but lost. I also get to show people what some of my designs used to look like way back.
No, you are definitely not the only one. Everyone has their own reasons for not wanting their sites archived, but obviously there are people who don't mind.
Mikari
Posts: 3159
Joined: Thu Jun 21, 2012 6:30 pm
Location: Coruscant

Re: Whats on your .htaccess?

Post by Mikari »

Eden wrote:I must be the only one who doesn't care that the WBM archives my sites. I really like that they do that, because I was able to grab a huge member list from a fanlisting I owned many years ago but lost. I also get to show people what some of my designs used to look like way back.
I actually like it and I'm amused by it. In fact, I looked up all my sites on the Wayback Machine to see if they were there. Those that were not apparently got set up to be saved or something, because it gave me the "archived 1 minute ago" thing.