A quick response to the messages from Jim Muckerheide and Fytton Rowland:
All servers that I am aware of do keep a record of the addresses from which
files are downloaded.
This does present serious privacy issues, and as a result there are very
few servers that make their logs widely available.
To answer another part of the question, server logs would be of very limited
use in producing "discussion lists" and the like. The reason is that these
logs are not as informative as one would like for such purposes (which is
a relief to many privacy advocates and a hindrance to direct marketers and
the like). What server logs do is record the IP address of the machine
that requested a page, and this address looks like 135.207.225.12. One
can then use "reverse DNS lookup" to try to find out what machine that is.
Here is where the serious problems start. Quite a few such lookups fail
and yield no information about the IP address. (One can then try
to do other things, such as examine registries of autonomous systems, etc.,
but even that is of limited use, and let's skip it.) When the lookup
succeeds, you get information that varies in its utility. Some of the
addresses will be of the form
john-smith-pc.physics.harvard.edu
which suggests the request came from John Smith's PC in the Harvard Physics
Dept. (But even that is not certain, since this PC may have been passed on
to a student of John Smith.) Others, such as
156.cambridge-06-07rs.ma.dial-access.att.net
will tell you the request came from a dial-in customer of the AT&T WorldNet
ISP business, and that the modem bank is located in Cambridge, Mass.
It won't tell you who was using that PC, though. (For that you would need
to access the WorldNet logs, which are carefully guarded for privacy reasons.)
The next time you see that address, a different person might be using it.
Next, many requests come from addresses that look like
proxy1.questnet.net.au
which are proxies that hide any number of users behind them. None of these
entries yields a valid email address.
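To make the above concrete, here is a rough sketch in Python of the lookup
step and of sorting the results into the kinds of entries just described.
The function names and the keyword heuristics ("proxy", "cache", "dial",
"ppp") are my own illustrative assumptions, not a standard method; only
socket.gethostbyaddr is a real library call. Real hostnames follow no fixed
naming convention, so any rule of this kind will misclassify some of them.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname registered for ip, or None if the lookup fails.

    The resolver argument is a hypothetical hook so the function can be
    exercised without network access; by default it does a real lookup.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname


def classify(hostname):
    """Guess what kind of machine a hostname refers to (heuristic only)."""
    if hostname is None:
        return "unresolved"
    labels = hostname.lower().split(".")
    # Assumed convention: proxies often name themselves in the first label.
    if labels[0].startswith(("proxy", "cache")):
        return "proxy"
    # Assumed convention: dial-in pools often mention "dial" or "ppp".
    if any("dial" in label or "ppp" in label for label in labels):
        return "dial-up pool"
    return "named host"


# Classify a failed lookup and two of the hostnames quoted in the text.
for name in (None,
             "156.cambridge-06-07rs.ma.dial-access.att.net",
             "proxy1.questnet.net.au"):
    print(name, "->", classify(name))
```

Note that no branch returns a person or an email address. The most the
hostname gives you is a machine or a pool of machines.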
One of the complications in studying server logs is that you can never be
certain you have seen all accesses to a page. For example, if many people
are going through proxy1.questnet.net.au to access your pages, this proxy
will almost certainly cache (store a local copy of) at least some of those
pages, and then deliver them to requesters without leaving any trace
on your server.
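The undercounting is easy to see in a toy model. The sketch below is only
an illustration under simplifying assumptions: the class names and the page
path /paper.html are made up, and the cache never expires or revalidates
its copies, which real proxies do.

```python
class OriginServer:
    """Stands in for your web server; it logs every request it receives."""

    def __init__(self):
        self.log = []

    def fetch(self, client, path):
        self.log.append((client, path))
        return f"contents of {path}"


class CachingProxy:
    """Forwards the first request for a page, then answers from its cache."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin
        self.cache = {}

    def fetch(self, path):
        if path not in self.cache:
            # Only a cache miss ever reaches the origin server's log.
            self.cache[path] = self.origin.fetch(self.name, path)
        return self.cache[path]


origin = OriginServer()
proxy = CachingProxy("proxy1.questnet.net.au", origin)

readers = 25  # hypothetical number of people behind the proxy
for _ in range(readers):
    proxy.fetch("/paper.html")

logged = len(origin.log)
print(f"{readers} readers, {logged} entry in the server log")
```

In this model all 25 readers get the page, but the server log holds a single
entry, and that entry names the proxy rather than any reader.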
All these technical difficulties make it hard to evaluate usage in a
meaningful way.
Andrew Odlyzko
************************************************************************
Andrew Odlyzko                          amo_at_research.att.com
AT&T Labs - Research                    voice: 973-360-8410
http://www.research.att.com/~amo        fax:   973-360-8178
************************************************************************
Received on Wed Feb 10 1999 - 19:17:43 GMT