c++.chat - interesting spam trap

roland <--nancyetroland free.fr> writes:

http://www.unclebobsuncle.com/antispam.html

roland :-)

Jun 02 2003

"Greg Peet" <admin gregpeet.com> writes:

Wow, thanks for bringing that to our attention. I can't wait to put that on
my site. Quite funny, too.

"roland" <--nancyetroland free.fr> wrote in message
news:bbgc75$2339$1 digitaldaemon.com...
 http://www.unclebobsuncle.com/antispam.html

 roland :-)

Jun 02 2003

Jan Knepper <jan smartsoft.us> writes:

Interesting indeed, but it does not work. Besides, most of the
statements on the page are unfounded.

First of all, any decent spider or crawler keeps track of the
URLs it has processed. I mean, think about it: every decent
website probably has circular references in the form of x.html
-> y.html -> z.html -> x.html. I know for a fact that quite a
few of my sites have many of these. Obviously this is something
anyone developing a spider or crawler, which I have done ;-),
will run into. So the idea is cute, but I don't think it really
works.
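
To illustrate the point about circular references, here is a
minimal sketch of the bookkeeping. It is not code from any real
crawler: the link graph is a made-up in-memory dictionary and the
function name crawl is an assumption for illustration.

```python
from collections import deque

def crawl(start, links):
    """Breadth-first walk over `links` (page -> list of linked pages).

    Each URL is visited once, even when the pages link in a circle.
    """
    seen = {start}          # every URL already queued or processed
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

# Hypothetical site with the loop x.html -> y.html -> z.html -> x.html
site = {"x.html": ["y.html"], "y.html": ["z.html"], "z.html": ["x.html"]}
print(crawl("x.html", site))  # ['x.html', 'y.html', 'z.html']
```

The loop terminates because z.html's link back to x.html is already
in the seen set, so a page that links to itself in a circle cannot
trap this kind of crawler.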

Second, quite a bit of the page is generated through JavaScript.
Many spiders or crawlers do NOT run JavaScript. I know for a
fact that JavaScript is a serious challenge for many of the
search engines on the internet.

Third, some more advanced spiders or crawlers do not just look
at mailto: tags, but recognize an '@' and check the prefix and
suffix. They run the complete string through an email syntax
checker to make sure the address contains only legal email
address characters and actually ends with an existing Top Level
Domain (TLD) such as .com, .net, etc., and later check the
domain through DNS and/or Whois.
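
A rough sketch of that '@'-based matching with a TLD check might
look like the following. The pattern is a deliberate simplification
(real address syntax is more permissive), and the names KNOWN_TLDS,
ADDRESS and harvest, as well as the tiny TLD list, are assumptions
for illustration only.

```python
import re

# Hypothetical, deliberately tiny TLD list; a real tool would load a full one.
KNOWN_TLDS = {"com", "net", "org", "biz", "fr", "us"}

# Simplified pattern: local part, '@', dotted domain, final label captured as TLD.
ADDRESS = re.compile(r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+([A-Za-z]{2,})")

def harvest(text):
    """Return address-like strings in `text` whose last label is a known TLD."""
    found = []
    for match in ADDRESS.finditer(text):
        if match.group(1).lower() in KNOWN_TLDS:
            found.append(match.group(0))
    return found

page = "mail bob@example.com or fake@big.stick.zzz today"
print(harvest(page))  # ['bob@example.com']
```

The DNS and/or Whois step described above would come after this
filter and is not shown here.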

Fourth, the invalid email addresses have no effect on spammers.
They will burn some more bandwidth, but as spammers usually use
non-existent From: and Return-Path: addresses in their messages,
anyone but the spammer will receive the bounces.

Fifth, if the spammer actually had some form of decency, bulk
mailed to a list, and honored a removal mechanism, that
mechanism is usually intelligent enough to keep track of
bounces, probe them, and then remove them from the list
automagically. See for instance http://www.ezmlm.org/, which
works with MySQL http://www.mysql.com/, through which it is
rather easy to maintain a database with millions of email
addresses.

To actually *fight* SPAM, what would make sense is to report SPAM
ASAP at http://www.spamcop.net/, as that results in more than
just reporting. One of the great features is that once a lot of
people start reporting a certain SPAM, SpamCop will add the
originating IP address to bl.spamcop.net, which can be used by
email receiving servers (SMTP servers) to block incoming email
if it comes from one of the many blocked IP addresses.
Unfortunately, most people just seem to delete SPAM, and most
email providers do not seem to use bl.spamcop.net for email
blocking.
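
For readers unfamiliar with DNS-based block lists, the usual
convention is to reverse the octets of the sender's IPv4 address,
append the list's zone, and look that name up: an answer means
"listed", a failed lookup means "not listed". The sketch below
shows only that lookup. It is not how any particular SMTP server
is wired up, and the function names and the injectable resolve
parameter are assumptions made so the logic can be tried offline.

```python
import socket

def dnsbl_query_name(ip, zone="bl.spamcop.net"):
    """Reverse the IPv4 octets and append the block list's zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone="bl.spamcop.net", resolve=socket.gethostbyname):
    """True when the reversed name resolves, i.e. the list has an entry."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except OSError:  # name did not resolve: not on the list
        return False

# 192.0.2.99 is a documentation-only address used purely as an example.
print(dnsbl_query_name("192.0.2.99"))  # 99.2.0.192.bl.spamcop.net
```

A receiving server would run a check like is_listed() on the
connecting IP address before accepting the message.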

Of course, not publishing your email address ANYWHERE on the
internet would help the most! ;-) However, I have noticed that
quite a few companies that collect email addresses with online
sales or other forms of subscription also sell those email
addresses to others...

Just my 2 cents... Oh, in case there is any doubt... ;-) I have
written a couple of crawlers and actually also crawlers that do
handle JavaScripts very well. I have been hosting Internet
services for 3 years. I do report almost all spam at
http://www.spamcop.net/ and yes, the mail servers here do check
bl.spamcop.net (and a few others) before they actually receive
the email, well that is if the domain owners want it. Check
http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
for some statistics on SPAM blocking...
Recently I patched the SMTP server again so it blocks all
non-existent email addresses on local domains.



roland wrote:

 http://www.unclebobsuncle.com/antispam.html

 roland :-)

--
ManiaC++
Jan Knepper

Jun 02 2003

"KarL" <someone somewhere.org> writes:

And do you run sendmail, qmail, or postfix?

"Jan Knepper" <jan smartsoft.us> wrote in message
news:3EDBFCDB.9D953019 smartsoft.us...

 Just my 2 cents... Oh, in case there is any doubt... ;-) I have
 written a couple of crawlers and actually also crawlers that do
 handle JavaScripts very well. I have been hosting Internet
 services for 3 years. I do report almost all spam at
 http://www.spamcop.net/ and yes, the mail servers here do check
 bl.spamcop.net (and a few others) before they actually receive
 the email, well that is if the domain owners want it. Check
 http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
 for some statistics on SPAM blocking...
 Recently I patched the SMTP server again so it does block all
 non-existent email addresses on local domains.

Jun 02 2003

Jan Knepper <jan smartsoft.us> writes:

Definitely not sendmail...
Patched qmail...



KarL wrote:

 And do you run sendmail, qmail, or postfix?

 "Jan Knepper" <jan smartsoft.us> wrote in message
 news:3EDBFCDB.9D953019 smartsoft.us...

 Just my 2 cents... Oh, in case there is any doubt... ;-) I have
 written a couple of crawlers and actually also crawlers that do
 handle JavaScripts very well. I have been hosting Internet
 services for 3 years. I do report almost all spam at
 http://www.spamcop.net/ and yes, the mail servers here do check
 bl.spamcop.net (and a few others) before they actually receive
 the email, well that is if the domain owners want it. Check
 http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
 for some statistics on SPAM blocking...
 Recently I patched the SMTP server again so it does block all
 non-existent email addresses on local domains.

Jun 03 2003

"Walter" <walter digitalmars.com> writes:

"Jan Knepper" <jan smartsoft.us> wrote in message
news:3EDBFCDB.9D953019 smartsoft.us...
 Just my 2 cents... Oh, in case there is any doubt... ;-) I have
 written a couple of crawlers and actually also crawlers that do
 handle JavaScripts very well. I have been hosting Internet
 services for 3 years. I do report almost all spam at
 http://www.spamcop.net/ and yes, the mail servers here do check
 bl.spamcop.net (and a few others) before they actually receive
 the email, well that is if the domain owners want it. Check
 http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
 for some statistics on SPAM blocking...

I use a javascript generated mailto: on the digitalmars web pages. Are the
javascript aware scrapers able to figure those out?

Jun 02 2003

Jan Knepper <jan smartsoft.us> writes:

Walter wrote:

 "Jan Knepper" <jan smartsoft.us> wrote in message
 news:3EDBFCDB.9D953019 smartsoft.us...
 Just my 2 cents... Oh, in case there is any doubt... ;-) I have
 written a couple of crawlers and actually also crawlers that do
 handle JavaScripts very well. I have been hosting Internet
 services for 3 years. I do report almost all spam at
 http://www.spamcop.net/ and yes, the mail servers here do check
 bl.spamcop.net (and a few others) before they actually receive
 the email, well that is if the domain owners want it. Check
 http://www.digitaldaemon.com/Internet%20Services/rblsmtpd.shtml
 for some statistics on SPAM blocking...

 I use a javascript generated mailto: on the digitalmars web pages. Are the
 javascript aware scrapers able to figure those out?

Yes! My crawler will pick those up without ANY problem.

Jan

Jun 03 2003

"Walter" <walter digitalmars.com> writes:

"Jan Knepper" <jan smartsoft.us> wrote in message
news:3EDC8F17.9616A80A smartsoft.us...
 Walter wrote:
 I use a javascript generated mailto: on the digitalmars web pages. Are the
 javascript aware scrapers able to figure those out?

 Yes! My crawler will pick those up without ANY problem.

Does that mean I have to write a cgi program to do it? <g>

Jun 03 2003

Jan Knepper <jan smartsoft.us> writes:

Walter wrote:

 "Jan Knepper" <jan smartsoft.us> wrote in message
 news:3EDC8F17.9616A80A smartsoft.us...
 Walter wrote:
 I use a javascript generated mailto: on the digitalmars web pages. Are the
 javascript aware scrapers able to figure those out?

 Yes! My crawler will pick those up without ANY problem.

 Does that mean I have to write a cgi program to do it? <g>

No, I can provide you with that, if you want...

Jun 03 2003

roland <--rv ronetech.com> writes:

hello

thanks for the interesting information

cheers

roland

Jun 03 2003

roland <--nancyetroland free.fr> writes:

hi

jan: an opinion on that ?

<<

yep, that's the reason why i suggested a webring of spamtraps would do
better, with the addresses generated from a wide list of word
combinations. just imagine how many combinations could be made with
this set of data

rule: [ a | a+b | a+b+c | a+c | ... | b+a ] + @ + [ domain ].[level]

where:

a, b, c ..: this, that, free, sun, ram, bot, mail, fish, stick, 33, big,
flower
domain : big, stick, homer, biz, temp, duch, pleht
level : com, biz, net, org, mil


the list could be customized for each website. i don't see how the
crawler could take all those words into consideration. they can remove
the invalid addresses when they bounce, but i think the scheme we are
discussing right now will guarantee that they have an adequate supply
for a very long time. imagine a webring of 500 sites linking to one
another.

ciao!
_________________
You have read a post from a newbie. Take everything with a grain of salt.



The user formerly known as ramfree17 (oh,im still ramfree17 ?!?!)

Jun 04 2003

Jan Knepper <jan smartsoft.us> writes:

roland wrote:

 yep, that's the reason why i suggested a webring of spamtraps would do
 better, with the addresses generated from a wide list of word
 combinations. just imagine how many combinations could be made with
 this set of data

 rule: [ a | a+b | a+b+c | a+c | ... | b+a ] + @ + [ domain ].[level]

 where:

 a, b, c ..: this, that, free, sun, ram, bot, mail, fish, stick, 33, big,
 flower
 domain : big, stick, homer, biz, temp, duch, pleht
 level : com, biz, net, org, mil

 the list could be customized for each website. i don't see how the
 crawler could take all those words into consideration. they can remove
 the invalid addresses when they bounce, but i think the scheme we are
 discussing right now will guarantee that they have an adequate supply
 for a very long time. imagine a webring of 500 sites linking to one
 another.

500 websites (pages) in a webring would take a decent crawler no more than 2
hours to process. Believe me, they are NOT using DSL or Cable!!!
Serial processing of 500 web pages @ 10 seconds per page (boy is that long!)
is 5000 seconds, that's not more than 2 hours! Then they match whatever they
found against a local DNS server with enough cache. Try:

If you have a Unix/BSD/Linux box, type this one line somewhere:

dig mx free.fr

; <<>> DiG 8.3 <<>> mx free.fr
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 10, AUTHORITY: 2, ADDITIONAL: 12
;; QUERY SECTION:
;;      free.fr, type = MX, class = IN

;; ANSWER SECTION:
free.fr.                1D IN MX        10 mx.free.fr.
free.fr.                1D IN MX        20 mrelay2-1.free.fr.
free.fr.                1D IN MX        20 mrelay2-2.free.fr.
free.fr.                1D IN MX        20 mx1-1.free.fr.
free.fr.                1D IN MX        50 mrelay3-2.free.fr.
free.fr.                1D IN MX        50 mrelay4-2.free.fr.
free.fr.                1D IN MX        50 mrelay1-1.free.fr.
free.fr.                1D IN MX        50 mrelay1-2.free.fr.
free.fr.                1D IN MX        60 mrelay3-1.free.fr.
free.fr.                1D IN MX        90 ns1.proxad.net.

;; AUTHORITY SECTION:
free.fr.                1D IN NS        ns0.proxad.net.
free.fr.                1D IN NS        ns1.proxad.net.

;; ADDITIONAL SECTION:
mx.free.fr.             15M IN A        213.228.0.1
mx.free.fr.             15M IN A        213.228.0.129
mx.free.fr.             15M IN A        213.228.0.13
mx.free.fr.             15M IN A        213.228.0.131
mx.free.fr.             15M IN A        213.228.0.166
mx.free.fr.             15M IN A        213.228.0.175
mx.free.fr.             15M IN A        213.228.0.65
mrelay2-1.free.fr.      1D IN A         213.228.0.13
mrelay2-2.free.fr.      1D IN A         213.228.0.131
mx1-1.free.fr.          1D IN A         213.228.0.65
mrelay3-2.free.fr.      1D IN A         213.228.0.166
mrelay4-2.free.fr.      1D IN A         213.228.0.175

;; Total query time: 127 msec
;; FROM: digitaldaemon.com to SERVER: default -- 63.105.9.35
;; WHEN: Thu Jun  5 10:04:24 2003
;; MSG SIZE  sent: 25  rcvd: 502

This was done with the 'dig' program; total query time is 127 msec!!!!
Now they know whether or not the found domain actually has an MX record.
If not, they can just drop the address from the list.
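
As a rough illustration of that last step, the sketch below reads
dig-style text and pulls out the MX records; an empty result would
mean the address can be dropped. It only parses saved output (a
real tool would more likely query DNS directly), and the function
name mx_hosts and the shortened sample are assumptions.

```python
def mx_hosts(dig_output):
    """Return sorted (preference, host) pairs from the MX lines of dig-style text."""
    records = []
    for line in dig_output.splitlines():
        fields = line.split()
        # Skip comment lines (starting with ';') and non-MX records.
        if line.lstrip().startswith(";") or "MX" not in fields:
            continue
        at = fields.index("MX")
        if len(fields) >= at + 3 and fields[at + 1].isdigit():
            records.append((int(fields[at + 1]), fields[at + 2]))
    return sorted(records)

# Shortened copy of the answer section shown above.
sample = """\
;; ANSWER SECTION:
free.fr.                1D IN MX        20 mx1-1.free.fr.
free.fr.                1D IN MX        10 mx.free.fr.
"""
print(mx_hosts(sample))  # [(10, 'mx.free.fr.'), (20, 'mx1-1.free.fr.')]
```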

Also crawlers do *not* browse the web like we do. They just process 'text'
oriented files and run several (read hundreds or thousands) of
threads/processes at the same time.

So, the only thing you would be able to actually make a difference with is
using existing domain names. Not a good idea, as owners of those domains might
have a catch-all and then receive the same SPAM over and over again. Soon,
the providers will all change their systems so their SMTP servers only accept
email to addresses that actually do exist and *deny* receipt of anything else
with the usual 550 error.
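
The accept-or-deny decision described here can be sketched in a few
lines. This is not taken from any mail server: the function name
rcpt_reply, the reply texts, and the example mailboxes are all
assumptions, and a real server would do this inside its SMTP
dialogue at the RCPT TO stage.

```python
def rcpt_reply(recipient, mailboxes):
    """Return the SMTP reply line for one RCPT TO address."""
    if recipient.lower() in mailboxes:
        return "250 OK"
    return "550 No such user here"

# Hypothetical local mailboxes, for illustration only.
local = {"postmaster@example.com", "info@example.com"}
print(rcpt_reply("info@example.com", local))        # 250 OK
print(rcpt_reply("big+stick@example.com", local))   # 550 No such user here
```

With a check like this in place, generated trap addresses on a real
domain are refused outright instead of landing in a catch-all.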

So, in the end, what are we actually creating with stuff like this???
Nothing, we just have crawlers/spiders consume more bandwidth to read all the
pages. The crawlers' DNS matcher consumes more bandwidth to check DNS. The
bulk mailer consumes more bandwidth to send all the email. The internet
consumes more bandwidth to deal with all the bounces, double bounces, etc.

Last, 500 pages with 1,000 email addresses each is 500,000 email addresses.
I hate to tell you, but that's only 1.5% of the total email addresses I
have... <sigh> Do you honestly think that anyone would process the bounces
for numbers like that manually???
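
For anyone who wants to check the numbers in this post, the
arithmetic works out as below. The variable names are made up; the
figures are the ones quoted above, and the "implied total" is only
what the 1.5% figure would suggest, not a number stated in the post.

```python
pages = 500
seconds_per_page = 10

crawl_seconds = pages * seconds_per_page   # 5000 seconds
crawl_minutes = crawl_seconds / 60         # a bit over 83 minutes, under 2 hours

addresses = pages * 1_000                  # 500000 trap addresses
implied_total = addresses / 0.015          # roughly 33 million addresses

print(crawl_seconds, round(crawl_minutes), addresses, round(implied_total / 1e6))
```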

ManiaC++
Jan Knepper

Jun 05 2003

roland <--nancyetroland free.fr> writes:

ok
i'm afraid i'm consuming _your_ bandwidth .. ;-)
a last question: what happens a) to the crawlers, b) to the internet, if
100000 sites have 10000 e-mail addresses each (= 10^9 in total)?

roland

Jun 05 2003

Jan Knepper <jan smartsoft.us> writes:

roland wrote:

 Jan Knepper wrote:
 roland wrote:

 yep, thats the reason why i suggested a webring of spamtraps would do
 better and the addresses be generated from a wide list of word
 combination. just imagine the how many combination could be done with
 this set of data

 rule: [ a | a+b | a+b+c | a+c | ... | b+a ] +   + [ domain ].[level]

 where:

 a, b ,c ..: this, that, free, sun, ram, bot, mail, fish, stick, 33, big,
 flower
 domain : big, stick, homer, biz, temp, duch, pleht
 level : com, biz, net, org, mil

 the list could be customized per each website. i dont see how the
 crawler could take all those words into consideration. they can remove
 the invalid mails when it bounce but i think the one we are discussing
 right now will guarantee that they will have an adequate supply for a
 very long time. imagine a webring of 500 sites linking one another.

 500 websites (pages) in a webring would take a decent crawler no more
 than 2 hours to process. Believe me, they are NOT using DSL or Cable!!!
 Serial processing of 500 web pages   10 seconds per page (boy is that
 long!) is 5000 seconds, that's not more than 2 hours! Than they match
 what ever they found against a local DNS server with enough cache. Try:

 If you have a Unix/BSD/Linux box one line somewhere:



 ; <<>> DiG 8.3 <<>> mx free.fr
 ;; res options: init recurs defnam dnsrch
 ;; got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2
 ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 10, AUTHORITY: 2, ADDITIONAL: 12
 ;; QUERY SECTION:
 ;;      free.fr, type = MX, class = IN

 ;; ANSWER SECTION:
 free.fr.                1D IN MX        10 mx.free.fr.
 free.fr.                1D IN MX        20 mrelay2-1.free.fr.
 free.fr.                1D IN MX        20 mrelay2-2.free.fr.
 free.fr.                1D IN MX        20 mx1-1.free.fr.
 free.fr.                1D IN MX        50 mrelay3-2.free.fr.
 free.fr.                1D IN MX        50 mrelay4-2.free.fr.
 free.fr.                1D IN MX        50 mrelay1-1.free.fr.
 free.fr.                1D IN MX        50 mrelay1-2.free.fr.
 free.fr.                1D IN MX        60 mrelay3-1.free.fr.
 free.fr.                1D IN MX        90 ns1.proxad.net.

 ;; AUTHORITY SECTION:
 free.fr.                1D IN NS        ns0.proxad.net.
 free.fr.                1D IN NS        ns1.proxad.net.

 ;; ADDITIONAL SECTION:
 mx.free.fr.             15M IN A        213.228.0.1
 mx.free.fr.             15M IN A        213.228.0.129
 mx.free.fr.             15M IN A        213.228.0.13
 mx.free.fr.             15M IN A        213.228.0.131
 mx.free.fr.             15M IN A        213.228.0.166
 mx.free.fr.             15M IN A        213.228.0.175
 mx.free.fr.             15M IN A        213.228.0.65
 mrelay2-1.free.fr.      1D IN A         213.228.0.13
 mrelay2-2.free.fr.      1D IN A         213.228.0.131
 mx1-1.free.fr.          1D IN A         213.228.0.65
 mrelay3-2.free.fr.      1D IN A         213.228.0.166
 mrelay4-2.free.fr.      1D IN A         213.228.0.175

 ;; Total query time: 127 msec
 ;; FROM: digitaldaemon.com to SERVER: default -- 63.105.9.35
 ;; WHEN: Thu Jun  5 10:04:24 2003
 ;; MSG SIZE  sent: 25  rcvd: 502

 This is done with the 'dig' progam total query time is 127 msec's!!!!
 Now they know whether or not the found domain actually has a MX record,
 or not... If not, they can just drop the address from the list.

 Also crawlers do *not* browse the web like we do. They just process
 'text' oriented files and run several (read hundreds or thousands) of
 threads/processes at the same time.

 So, the only thing you would be able to actually make a difference with
 is using existing domain names. Not a good idea as owners of those
 domains might have a catch all and than receive the same SPAM over and
 over again. Soon, the providers will all change their systems so their
 SMTP servers only accept email to addresses that actually do exist and
 *deny* receipt of anything else with the usual 550 error.

 So, in the end, what are we actually creating with stuff like this???
 Nothing, we just have crawlers/spiders consume more bandwidth to read
 all the pages. The crawlers' DNS matcher consumes more bandwidth to check
 DNS. The bulk mailer consumes more bandwidth to send all the email.
 The internet consumes more bandwidth to deal with all the bounces, double
 bounces, etc.

 Last, 500 pages with 1,000 email addresses each is 500,000 email
 addresses. I hate to tell you, but that's only 1.5% of the total email
 addresses I have... <sigh> Would you honestly think that anyone would
 process the bounces for numbers like that manually???

 ManiaC++
 Jan Knepper

 ok
 i'm afraid i'm consuming _your_ bandwidth .. ;-)

Don't worry.

 a last question: what happens a) to the crawlers, b) to the internet, if
 100000 sites have 10000 (=10^9) e-mail addresses each?

;-)
Internet Meltdown...
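
Running the numbers quoted in this thread, as a quick Python check. Only the figures stated in the posts are used; the implied list size is derived from the 1.5% claim and is an estimate, not a number anyone gave.

```python
# 500 pages with 1,000 generated addresses each
trap_addresses = 500 * 1_000

# "only 1.5% of the total" implies a list of roughly this size
# (integer arithmetic: divide by 15/1000)
implied_total = trap_addresses * 1000 // 15

# The hypothetical: 100,000 sites with 10,000 addresses each
hypothetical = 100_000 * 10_000

print(f"{trap_addresses:,}")   # 500,000
print(f"{implied_total:,}")    # 33,333,333
print(f"{hypothetical:,}")     # 1,000,000,000
```

So the hypothetical is 10^9 addresses, about 30 times the size of the list implied above.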

Jan

Jun 05 2003

roland <--rv ronetech.com> writes:

Jan Knepper wrote:

 Internet Meltdown...
 
 Jan
 

oops  8-(

roland

Jun 06 2003

"Greg Peet" <admin gregpeet.com> writes:

"roland" wrote:
 a last question: what happens a) to the crawlers, b) to the internet, if
 100000 sites have 10000 (=10^9) e-mail addresses each?

a) Logic, b) Didn't Nostradamus say something about
this...hmm...armageddon...bill gates...something along those lines i think
=P

Jun 06 2003

roland <--rv ronetech.com> writes:

Greg Peet wrote:

 "roland" wrote:
 
a last question: what happens a) to the crawlers, b) to the internet, if
100000 sites have 10000 (=10^9) e-mail addresses each?

 
 a) Logic, b) Didn't Nostradamus say something about
 this...hmm...armageddon...bill gates...something along those lines i think
 =P
 

let's talk about something else .. spam is not so bad after all

you can buy a master's degree without studying, improve sexual 
satisfaction, earn thousands in cash without working ... ;-)

bye

roland

Jun 06 2003

Scott Dale Robison <scott-news.digitalmars.com isdr.net> writes:

Jan Knepper wrote:
 To actually *fight* SPAM, what would make sense is to report SPAM
 ASAP at http://www.spamcop.net/ as that results in more than
 just reporting. One of the great features is that once a lot of
 people start reporting a certain SPAM, spamcop will add the
 originating IP address to bl.spamcop.net, which can be used by
 email receiving servers (SMTP servers) to block incoming email
 if it comes from one of the many blocked IP addresses.
 Unfortunately, most people just seem to delete SPAM and most
 email providers do not seem to use bl.spamcop.net for email
 blocking.
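
For readers unfamiliar with how a receiving server consults a list like bl.spamcop.net: the usual DNS blocklist convention is to reverse the octets of the connecting IPv4 address, append the list's zone, and look that name up. An address answer means "listed"; no such name means "not listed". A small Python sketch follows; the helper names are made up for illustration, and `192.0.2.1` is a documentation-only address.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "bl.spamcop.net") -> str:
    """Build the DNS name a mail server looks up to test an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str = "bl.spamcop.net") -> bool:
    """Return True if the blocklist answers for this address.

    Needs network access. Any lookup failure is treated as
    "not listed", which is a simplification.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


print(dnsbl_query_name("192.0.2.1"))  # 1.2.0.192.bl.spamcop.net
```

Only `dnsbl_query_name` is exercised here; `is_listed` is included to show where the actual lookup would go.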

I agree with 99.99% of what you wrote, this being the one part I 
(partially) disagree with. Sure, SpamCop (and other similar services) 
can prove valuable, but they have some serious potential downfalls. The 
single biggest one, IMO, is that many spam-blocking services don't care 
about the source of an email. If it is reported as spam, they have no 
obligation to confirm it. I personally know of cases where actual 
*documentation* of a person's opt-in was completely and utterly ignored. 
The person in question didn't bother trying to opt-out (note: after 
having opt'ed-in), they just reported the 'spam' to SpamCop and the 
'offending' mail server was black-holed. Note: I realize this is just my 
word against theirs, and I don't expect anyone to just assume I'm 
right. I'm just sharing a personal experience and it's worth exactly 
what you're paying for it.

I guess the point I'm trying to make is, if you want to use SpamCop or 
any other similar service, feel free. Just realize that these entities 
are no more regulated than the spammers they claim to want to stop, and 
sometimes an agenda may slip through. After all, their value is in 
blocking email. So what if sometimes legitimate email gets blocked?

No, I'm not a spammer. Just a person with opinions. :)

Scott Dale Robison

Jun 07 2003

gf <mz_y2k yahoo...com> writes:

Scott Dale Robison <scott-news.digitalmars.com isdr.net> wrote in 
news:bbsff6$1nvl$1 digitaldaemon.com:

 I agree with 99.99% of what you wrote, this being the one part I 
 (partially) disagree with. Sure, SpamCop (and other similar services) 
 can prove valuable, but they have some serious potential downfalls. The 
 single biggest one, IMO, is that many spam-blocking services don't care 
 about the source of an email. If it is reported as spam, they have no 
 obligation to confirm it. I personally know of cases where actual 
 *documentation* of a person's opt-in was completely and utterly ignored. 
 The person in question didn't bother trying to opt-out (note: after 
 having opt'ed-in), they just reported the 'spam' to SpamCop and the 
 'offending' mail server was black-holed. Note: I realize this is just my 
 word against theirs, and I don't expect anyone to just assume I'm 
 right. I'm just sharing a personal experience and it's worth exactly 
 what you're paying for it.
 
 I guess the point I'm trying to make is, if you want to use SpamCop or 
 any other similar service, feel free. Just realize that these entities 
 are no more regulated than the spammers they claim to want to stop, and 
 sometimes an agenda may slip through. After all, their value is in 
 blocking email. So what if sometimes legitimate email gets blocked?
 
 No, I'm not a spammer. Just a person with opinions. :)
 
 Scott Dale Robison


You sure fooled me! :)))))


/gf

Jun 07 2003

Jan Knepper <jan smartsoft.us> writes:

Scott Dale Robison wrote:

 Jan Knepper wrote:
 To actually *fight* SPAM, what would make sense is to report SPAM
 ASAP at http://www.spamcop.net/ as that results in more than
 just reporting. One of the great features is that once a lot of
 people start reporting a certain SPAM, spamcop will add the
 originating IP address to bl.spamcop.net, which can be used by
 email receiving servers (SMTP servers) to block incoming email
 if it comes from one of the many blocked IP addresses.
 Unfortunately, most people just seem to delete SPAM and most
 email providers do not seem to use bl.spamcop.net for email
 blocking.

 I agree with 99.99% of what you wrote, this being the one part I
 (partially) disagree with. Sure, SpamCop (and other similar services)
 can prove valuable, but they have some serious potential downfalls. The
 single biggest one, IMO, is that many spam-blocking services don't care
 about the source of an email. If it is reported as spam, they have no
 obligation to confirm it. I personally know of cases where actual
 *documentation* of a person's opt-in was completely and utterly ignored.
 The person in question didn't bother trying to opt-out (note: after
 having opt'ed-in), they just reported the 'spam' to SpamCop and the
 'offending' mail server was black-holed. Note: I realize this is just my
 word against theirs, and I don't expect anyone to just assume I'm
 right. I'm just sharing a personal experience and it's worth exactly
 what you're paying for it.

I know... I have experienced that as well.
That is indeed one of the unfortunate sides of spamcop.net

 I guess the point I'm trying to make is, if you want to use SpamCop or
 any other similar service, feel free. Just realize that these entities
 are no more regulated than the spammers they claim to want to stop, and
 sometimes an agenda may slip through. After all, their value is in
 blocking email. So what if sometimes legitimate email gets blocked?

I *only* use spamcop for those emails that are *SPAM*.
Legitimate email never got blocked as spamcop only begins blocking after a
certain threshold has been reached; at least, I have never heard complaints
about it...

 No, I'm not a spammer. Just a person with opinions. :)

I agree. I just stated that spamcop provides a service, not that it is
perfect ;-)

Jan

Jun 07 2003

Scott Dale Robison <scott-news.digitalmars.com isdr.net> writes:

Jan Knepper wrote:
 I *only* use spamcop for those emails that are *SPAM*.
 Legitimate email never got blocked as spamcop only begins blocking after a
 certain threshold has been reached; at least, I have never heard complaints
 about it...

I've never heard complaints from a user of SpamCop, to be fair. Only a 
person who had a newsletter stopped going to an entire domain because of 
SpamCop. Yet another note: I will admit that it is *possible* that the 
email in question was technically spam ... unfortunately, we can never 
know as no one would even *try* to opt out or follow up on the original opt 
in. In any case, I'm convinced that *if* the offender was guilty, it was 
unintentional.

Also, to be fair, I was once guilty of unintentionally running an open 
relay, but the 'good samaritan' that caught me at it was nice enough to 
remove me from their open relay database once I closed it. I do 
recognize that most of these people are good guys ... I'm just concerned 
when so many people on the net don't think through the potential 
problems of taking someone else's word on what is or is not a spamming IP.

 I agree. I just stated that spamcop provides a service, not that it is
 perfect ;-)

Fair enough. Apologies if I was offensive in any way. :)

</soapbox>

SDR

Jun 07 2003

Jan Knepper <jan smartsoft.us> writes:

Scott Dale Robison wrote:

 I've never heard complaints from a user of SpamCop, to be fair. Only a
 person who had a newsletter stopped going to an entire domain because of
 SpamCop. Yet another note: I will admit that it is *possible* that the
 email in question was technically spam ... unfortunately, we can never
 know as no one would even *try* to opt out or follow up on the original opt
 in. In any case, I'm convinced that *if* the offender was guilty, it was
 unintentional.

Oh, I have seen those complaints MANY times. People that actually opted in
themselves and then in time get sick of SPAM, find spamcop and start reporting
everything that comes into their mailbox, not remembering whether or not they
subscribed to it. Spamcop is very aware of this as well.

 Also, to be fair, I was once guilty of unintentionally running an open
 relay, but the 'good samaritan' that caught me at it was nice enough to
 remove me from their open relay database once I closed it. I do
 recognize that most of these people are good guys ... I'm just concerned
 when so many people on the net don't think through the potential
 problems of taking someone else's word on what is or is not a spamming IP.

The internet professionals are usually very tolerant and helpful, at least,
that's my experience. What did you use? sendmail???

Well, that's exactly the problem with the Internet at this moment. It's like
trying to drive your car on the highway with people around you that do not have a
license... <sigh>

 I agree. I just stated that spamcop provides a service, not that it is
 perfect ;-)

 Fair enough. Apologies if I was offensive in any way. :)

Nag!

ManiaC++
Jan Knepper

Jun 09 2003

Scott Dale Robison <scott-news.digitalmars.com isdr.net> writes:

Jan Knepper wrote:
 The internet professionals are usually very tolerant and helpful, at least,
 that's my experience. What did you use? sendmail???

I think I was running Xmail at the time, though I'm not 100% certain. 
It's been a long time...

SDR

Jun 09 2003
