Mirage  05-19-2004, 17:23   | [TCL] Getting the source code of a web site
Staff
(Moderator)
Member since 01/2004
54 Posts
Location: Zwickau, Germany
This is how to get the source code of a web site. The following script snippet requires the http package (which is included in Tcl 8.0 and any later version).
# load the http package
package require http

# Send the HTTP request. -timeout sets a timeout in milliseconds.
# catch keeps the script from aborting if http::geturl itself throws
# (e.g. because of an unsupported URL).
if {[catch {set token [http::geturl http://yourhost.com -timeout 3000]} err]} {
    puts stdout $err
    return
}

if {[http::status $token] eq "ok"} {
    # the request was successful: put the HTML source code into $data
    set data [http::data $token]
} elseif {[http::status $token] eq "timeout"} {
    # a timeout has occurred: report it on standard output
    puts stdout "Timeout occurred"
} elseif {[http::status $token] eq "error"} {
    # the transfer failed: send the error message to standard output
    puts stdout [http::error $token]
}

# last but not least, release the memory that was used for this request
http::cleanup $token

This post was edited 2 times, last on 08-14-2009, 03:22 by thommey
Mettwurst  08-26-2004, 21:02
(Moderator)


Member since 04/2004
13 Posts
Location: Stuttgart/Germany
Another one:
# load the http package
package require http

# Send the HTTP request. -timeout sets a timeout in milliseconds.
# catch keeps the script from aborting if http::geturl itself throws
# (e.g. because of an unsupported URL).
if {[catch {set token [http::geturl http://yourhost.com -timeout 3000]} err]} {
    putcmdlog $err
    return
}

if {[http::ncode $token] eq "404"} {
    # the page does not exist
    putcmdlog "Error: [http::code $token]"
} elseif {[http::status $token] eq "ok"} {
    # the request was successful: put the HTML source code into $data
    set data [http::data $token]
} elseif {[http::status $token] eq "timeout"} {
    # a timeout has occurred: write it to the log
    putcmdlog "Timeout occurred"
} elseif {[http::status $token] eq "error"} {
    # the transfer failed: write the error message to the log
    putcmdlog "Error: [http::error $token]"
}

# last but not least, release the memory that was used for this request
http::cleanup $token



Same as Mirage's, but with a check whether the page exists (HTTP 404).
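
The 404 check above catches only one failure code. As a rough sketch (not part of the original script), a broader check could treat every code outside the 2xx range as a failure. The helper name is_success is made up for this example; it would be fed the value returned by http::ncode.

```tcl
# Hypothetical helper (not part of the http package):
# returns 1 only for numeric status codes in the 2xx range, otherwise 0.
proc is_success {ncode} {
    return [expr {[string is integer -strict $ncode] && $ncode >= 200 && $ncode < 300}]
}

# Intended use with a token from http::geturl (needs network access, so left as a comment):
#   if {[http::status $token] eq "ok" && [is_success [http::ncode $token]]} {
#       set data [http::data $token]
#   }
```

An empty code, as can happen when no response line was received, counts as a failure here because it is not a strict integer.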

Edit: small bugs fixed
This post was edited 3 times, last on 08-14-2009, 03:22 by thommey
Mirage  08-27-2004, 15:52  
Note that putcmdlog is a Tcl command provided by Eggdrop; it is not available in a plain Tcl shell.
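
To try the scripts in this thread outside of Eggdrop, one option is to define a stand-in for putcmdlog when the command is missing. This is only a sketch: the replacement simply prints to standard output and does not reproduce Eggdrop's logging behavior.

```tcl
# Define a stand-in only if putcmdlog does not exist yet,
# so that a real Eggdrop command is never overwritten.
if {[llength [info commands putcmdlog]] == 0} {
    proc putcmdlog {text} {
        # assumption: plain output on stdout is good enough for testing
        puts stdout $text
    }
}

putcmdlog "putcmdlog is available"
```

Inside Eggdrop the if condition is false and the bot's own putcmdlog stays in place.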
sKy\  02-06-2006, 13:55  
(Moderator)


Member since 05/2005
8 Posts
Location: Germany, Zeitz
# Usage:
#   set website "www.yourhost.com"
#   set content [web2data $website]
# Returns:
#   - if no error: the source of the website
#   - if error: 0
proc web2data { website } {
    package require http
    # Send the HTTP request. -timeout sets a timeout in milliseconds.
    # catch keeps the script from aborting if http::geturl itself throws
    # (e.g. because of an unsupported URL). No token exists in that case,
    # so return right away instead of falling through to http::cleanup.
    if { [catch { set token [http::geturl $website -timeout 3000] } err] } {
        putcmdlog "web2data: Error: $err"
        return 0
    }
    if { [http::ncode $token] eq "404" } {
        # the page does not exist
        putcmdlog "web2data: Error: [http::code $token]"
    } elseif { [http::status $token] eq "ok" } {
        # the request was successful: put the HTML source code into $data
        set data [http::data $token]
    } elseif { [http::status $token] eq "timeout" } {
        # a timeout has occurred: write it to the log
        putcmdlog "web2data: Timeout occurred"
    } elseif { [http::status $token] eq "error" } {
        # the transfer failed: write the error message to the log
        putcmdlog "web2data: Error: [http::error $token]"
    }
    # last but not least, release the memory that was used for this request
    http::cleanup $token
    if { [info exists data] } {
        return $data
    } else {
        return 0
    }
}


# Same as Mettwurst's, but wrapped in a proc

This post was edited 1 times, last on 08-14-2009, 03:23 by thommey