09-09-2010, 03:29 +0200
Forum: Overviewworkscript announcements
A small call-it-a-lib script able to handle rss feeds
AuthorPost
SIYB  04-15-2009, 23:09   | RSS lib Jabber  Homepage        Quote  Reply
Member since 06/2007
7 Posts
A small script that will help writing applications that need to parse RSS(2.0) feeds. The most important (from my point of view) features of RSS2.0 have been included, please feel free to enhance the script and share your results if you feel that they are useful.

There is a small demo at the end of the script, it should cover the usage, don't forget to free the mallocs :>. Any questions? -> email / irc / thread


1  # Copyright (c) 2009 siyb <siyb |insert "you know what here[/quote] geekosphere.org>
2  #
3  # This software is provided 'as-is', without any express or implied
4  # warranty. In no event will the authors be held liable for any damages
5  # arising from the use of this software.
6  #
7  # Permission is granted to anyone to use this software for any purpose,
8  # including commercial applications, and to alter it and redistribute it
9  # freely, subject to the following restrictions:
10  #
11  #    1. The origin of this software must not be misrepresented; you must not
12  #    claim that you wrote the original software. If you use this software
13  #    in a product, an acknowledgment in the product documentation would be
14  #    appreciated but is not required.
15  #
16  #    2. Altered source versions must be plainly marked as such, and must not be
17  #    misrepresented as being the original software.
18  #
19  #    3. This notice may not be removed or altered from any source
20  #    distribution.
21  
22  
23  package require tdom
24  
25  namespace eval feedRead {
26      variable idPrefix "feedread";# the prefix to be used for feeds
27      variable formats [list rss]
28      variable tmpId;# stores the id temporarily (when a feed is added -> upvaring)
29      variable feedId 0;# the id counter
30      variable feedList [list];# list containing all feedids and the corresponding data
31      namespace eval handle {}
32  }
33  
34  #
35  # Enduser relevant
36  #
37  
38  # Creates a feed
39  proc feedRead::create {xmlData id} {
40      upvar 1 $id rid
41      # create domtree for the feed
42      set doc [dom parse -keepEmpties $xmlData]
43      set rid [feedRead::addFeed $doc];# add feed to feedlist and return the id
44  }
45  
46  # return the feedlist of all feeds currently active
47  proc feedRead::getFeedList {} {
48      return $feedRead::feedList
49  }
50  
51  # clean up (destroy) all feeds
52  proc feedRead::destroyAll {} {
53      foreach item [feedRead::feedList] {
54          feedRead::destroy [lindex $item 0]
55      }
56  }
57  
58  #
59  # Lib
60  #
61  
62  # deletes a feed from the list and gets rid of all traces
63  proc feedRead::destroy {id} {
64      set doc [$id infoDoc];# get the document
65      rename $id "";# delete the feed's procedure
66      set search [feedRead::findFeed $id]
67      set feedRead::feedList [lreplace $feedRead::feedList $search $search]
68      $doc delete
69  }
70  
71  # checks if a format is known to the script
72  proc feedRead::checkFormat {format} {
73      return [lsearch $feedRead::formats $format]
74  }
75  
76  # joins prefix and id
77  proc feedRead::joinId {id} {
78      return [append foo $feedRead::idPrefix $id]
79  }
80  
81  # Returns a feedread feedid
82  proc feedRead::mkid {} {
83      set id [feedRead::joinId $feedRead::feedId]
84      incr feedRead::feedId
85      return $id
86  }
87  
88  # find feed in list
89  proc feedRead::findFeed {id} {
90      return [lsearch -index 0 $feedRead::feedList $id]
91  }
92  
93  # replace feed in feedlist
94  proc feedRead::feedReplace {id replaceWith} {
95      set search [feedRead::findFeed $id]
96      set feedRead::feedList [lreplace $feedRead::feedList $search $search $replaceWith]
97  }
98  
99  # find and return feed from list
100  proc feedRead::returnFeed {id} {
101      return [lindex $feedRead::feedList [feedRead::findFeed $id]]
102  }
103  
104  # determine the feed type
105  proc feedRead::determineType {rootNode} {
106      return [$rootNode nodeName]
107  }
108  
109  # count the number of <item> of a feed
110  proc feedRead::countItems {id} {
111      return [llength [lindex [feedRead::returnFeed $id] end]]
112  }
113  
114  # returns an <item> by its place in the itemlist of the feed
115  proc feedRead::returnItemById {id iid} {
116      if {[feedRead::countItems $id] < $iid} {
117          error "Itemid '$iid' out of range"
118      }
119      return [lindex [lindex [feedRead::returnFeed $id] end] $iid]
120  }
121  
122  # adds a feed and the corresponding feed command
123  proc feedRead::addFeed {doc} {
124      upvar #0 feedRead::tempId id
125      set id [feedRead::mkid];# create a new id for this feed
126      set root [$doc documentElement]
127      set flist [list $id $doc $root [feedRead::determineType $root]]
128      set feedRead::feedList [lappend feedRead::feedList $flist];# add feed to feedlist
129      uplevel #0 {;# create the feed command
130          proc $feedRead::tempId {option {iid -1}} {
131              set id [string trim [dict get [info frame 0] proc] ::]
132              set format [feedRead::getFeedType $id]
133             
134              if {[feedRead::checkFormat $format] != -1} {
135                  feedRead::handle::${format} $id $option $iid
136              } else {
137                  error "'$format' not supported yet"
138              }
139          }
140      }
141      feedRead::feedReplace $id [lappend flist [feedRead::getChannelElement $id item -all]];# add all <item> to the feedlist
142      return $id
143  }
144  
145  # rss handler
146  proc feedRead::handle::rss {id option iid} {
147      switch $option {
148          "destroy"            { return [feedRead::destroy $id] }
149          "infoType"            { return [feedRead::getFeedType $id] }
150          "infoDoc"            { return [feedRead::getFeedDoc $id] }
151          "infoRoot"            { return [feedRead::getFeedRoot $id] }
152          "getChannelTitle"        { return [feedRead::getChannelElement $id title] }
153          "getChannelLink"        { return [feedRead::getChannelElement $id link] }
154          "getChannelDescription"        { return [feedRead::getChannelElement $id description] }
155          "getChannelLanguage"        { return [feedRead::getChannelElement $id language] }
156          "getChannelCopyright"        { return [feedRead::getChannelElement $id copyright] }
157          "getChannelManagingEditor"    { return [feedRead::getChannelElement $id managingEditor] }
158          "getChannelWebMaster"        { return [feedRead::getChannelElement $id webMaster] }
159          "getChannelPubDate"        { return [feedRead::getChannelElement $id pubDate] }
160          "getChannelLastBuildDate"    { return [feedRead::getChannelElement $id lastBuildDate] }
161          "getChannelCategory"        { return [feedRead::getChannelElement $id category] }
162          "getChannelGenerator"        { return [feedRead::getChannelElement $id generator] }
163          "getChannelDocs"        { return [feedRead::getChannelElement $id docs] }
164          "getChannelCloud"        { return [feedRead::getChannelElement $id cloud] }
165          "getChannelTtl"            { return [feedRead::getChannelElement $id ttl] }
166          "getChannelImage"        { return [feedRead::getChannelElement $id image] }
167          "getChannelRating"        { return [feedRead::getChannelElement $id rating] }
168          "getChannelTextInput"        { return [feedRead::getChannelElement $id textInput] }
169          "getChannelSkipHours"        { return [feedRead::getChannelElement $id skipHours] }
170          "getChannelSkipDays"        { return [feedRead::getChannelElement $id skipDays] }
171          "getChannelItems"        { return [feedRead::getChannelElement $id item -all] }
172          "getChannelItemLength"        { return [feedRead::countItems $id] }
173          "getItemTitle"            { return [feedRead::getItemElement $id $iid title] }
174          "getItemLink"            { return [feedRead::getItemElement $id $iid link] }
175          "getItemDescription"        { return [feedRead::getItemElement $id $iid description] }
176          "getItemAuthor"            { return [feedRead::getItemElement $id $iid author] }
177          "getItemCategory"        { return [feedRead::getItemElement $id $iid category] }
178          "getItemComments"        { return [feedRead::getItemElement $id $iid comments] }
179          "getItemEnclosure"        { return [feedRead::getItemElement $id $iid enclosure] }
180          "getItemGuid"            { return [feedRead::getItemElement $id $iid guid] }
181          "getItemPubDate"        { return [feedRead::getItemElement $id $iid pubDate] }
182          "getItemSource"            { return [feedRead::getItemElement $id $iid source] }
183          default                { error "Unknown option ($option)" }
184      }
185  }
186  
187  #return the feed's type
188  proc feedRead::getFeedType {id} {
189      return [lindex [split [feedRead::returnFeed $id]] 3]
190  }
191  
192  # return the feed's document
193  proc feedRead::getFeedDoc {id} {
194      return [lindex [split [feedRead::returnFeed $id]] 1]
195  }
196  
197  # return the feed root
198  proc feedRead::getFeedRoot {id} {
199      return [lindex [split [feedRead::returnFeed $id]] 2]
200  }
201  
202  # get channel
203  proc feedRead::getChannel {id} {
204      set root [feedRead::getFeedRoot $id]
205      return [$root getElementsByTagName channel]
206  }
207  
208  # get a channel element
209  proc feedRead::getChannelElement {id chanItem {option -single}} {
210      set channel [feedRead::getChannel $id]
211      set ci [$channel getElementsByTagName $chanItem]
212      switch $option {
213          "-single"     {
214              foreach item $ci {;# loop all titles, if the parentNode of the title matches the channel, return it's text value
215                  if {[$item parentNode] == $channel} { return [$item asText] }
216              }
217          }
218          "-all"     {
219              return $ci
220          }
221          default    { error "bad option \"$option\": must be -all or -single" }
222      }
223  }
224  
225  # get an <item> element
226  proc feedRead::getItemElement {id iid itemElement} {
227      set item [[feedRead::returnItemById $id $iid] getElementsByTagName $itemElement]
228      if {$item == ""} { return "" }
229      if {[llength $item] > 1} {;# of there is more than one element by that name
230          set rlist [list]
231          foreach i $item {
232              set rlist [lappend rlist [$i asText]]
233          }
234          return $rlist
235      } else {
236          return [$item asText]
237      }
238  }
239  
240  #
241  # documentation demonstration
242  #
243  proc feedRead::demo {feed} {
244  
245      package require http
246     
247      set token [::http::geturl $feed -timeout 2000]
248      set data [::http::data $token]
249      ::http::cleanup $token
250  
251      # lets create the feed first, $data is the feed's data
252      feedRead::create $data id
253      puts "Feedlist: [feedRead::getFeedList]";# print feedlist
254  
255      # some information on the feed
256      puts "\nfeed information:\n"
257      puts "    [$id infoType]";# the type of the doc
258      puts "    [$id infoDoc]";# the document identifier
259      puts "    [$id infoRoot]";# the root node identifier of the doc
260  
261      # channel related information -> [$id getChannelTitle] for instance
262      set l {
263          "getChannelTitle"
264          "getChannelLink"
265          "getChannelDescription"
266          "getChannelLanguage"
267          "getChannelCopyright"
268          "getChannelManagingEditor"
269          "getChannelWebMaster"
270          "getChannelPubDate"
271          "getChannelLastBuildDate"
272          "getChannelCategory"
273          "getChannelGenerator"
274          "getChannelDocs"
275          "getChannelCloud"
276          "getChannelTtl"
277          "getChannelImage"
278          "getChannelRating"
279          "getChannelTextInput"
280          "getChannelSkipHours"
281          "getChannelSkipDays"
282          "getChannelItems"
283          "getChannelItemLength"
284      }
285      puts "\nChannel related information:\n"
286      foreach item $l {
287          puts "    $item: '[$id $item]'"
288      }
289  
290      # item related stuff
291      puts "\nitem related stuff:\n"
292      set l {
293          "getChannelItemLength"
294          "getItemTitle"
295          "getItemLink"
296          "getItemDescription"
297          "getItemAuthor"
298          "getItemCategory"
299          "getItemComments"
300          "getItemGuid"
301          "getItemPubDate"
302          "getItemSource"
303      }
304      foreach item $l {
305          puts "    $item: '[$id $item 0]'"
306      }
307  
308      # now that we are done we need to destroy all data related to the feed in order to prevent memory from going apeshit
309      $id destroy
310  }
This post was edited 1 times, last on 04-21-2009, 20:10 by SIYB
  The user has attached a file: feedRead.tcl (Save, 9,831 Bytes, downloaded 7 times)
CyBex  04-16-2009, 00:24   Jabber  ICQ  AOL IM  Yahoo IM  MSN  Homepage        Quote  Reply
nie da
(Administrator)

Avatar

Member since 01/2004
236 Posts
Location: Berlin
can you add a download file to the posting? the forum has problems with backslashes.. :-)

mfG CyBex @ QuakeNet, UnderNet, EFNet, Freenode, euIRC, GameSurge & ShadowWorld - Files
SIYB  04-21-2009, 20:12   Jabber  Homepage        Quote  Reply
Member since 06/2007
7 Posts
Quote by CyBex on 04-16-2009, 00:24:
can you add a download file to the posting? the forum has problems with backslashes.. :-)

done :>
Advanced options for this topic:

Ignore this topic (Do not list this topic in the "unread topics" search. You are currently not ignoring this topic.)
Hide this topic (Hidden topics are not displayed in the topics list. This topic is currently not hidden.)
Go to forum

Unclassified NewsBoard 1.5.3-d | © 2003-4 by Yves Goergen