Friday, August 15, 2014

test

Monday, September 17, 2012

Switching to QT...

I have had some requests for my InSite application from Mac users.  About the same time that I got the requests, I saw a couple of very intriguing YouTube demos of QT here and here.

Needless to say I am going to port the work I have done in VB.net over to QT so I can have a Liberated Distribution of the final app.

I will be starting with this base code --> QT Base RSS Application.

I will post more as I add libraries to deal with Social Networking sites and add the full 2D grid view to my Aggregator...

Monday, May 3, 2010

Source Code to Detect RSS Feed Types Explained

The code may be hard to read here on Blogger, but a link to the full source is provided at the end of this article




This article assumes you have set up the CRSSFeed class from a previous post. However, this code can easily be adapted to any application by changing a few variable names. I strongly recommend an object-oriented approach to this problem, as it will let you RSS-enable any other apps you may develop. For more info on the CRSSFeed VB.Net class click here.



After poring over the specs for the various RSS feeds out there, I determined that I first needed to read three attributes of the Document Element: xmlns, xmlns:rdf, and version. In fact, xmlns:rdf is so important that I may add a member variable to my class to store it for use when I build my dictionary of metadata later. Here are my first lines of RSS detection code:



Dim xmlns As String = m_XMLDocument.DocumentElement.GetAttribute("xmlns")
Dim xmlns_rdf As String = m_XMLDocument.DocumentElement.GetAttribute("xmlns:rdf")
Dim rss_version As String = m_XMLDocument.DocumentElement.GetAttribute("version")

Next, we need to process the name of the Document Element. The possible values are rdf:RDF, channel, rss, and feed. XML is case sensitive, so to be tolerant of feeds that capitalize the name differently I do an LCase on the Document Element name before comparing. This yields the following Case structure:



Select Case LCase(m_XMLDocument.DocumentElement.Name)
    Case "rdf:rdf", "channel"
        ' Process RSS 1.0 here
    Case "rss"
        ' Process RSS 2.0 here
    Case "feed"
        ' Process Atom 1.0 here
    Case Else
        ' Unknown format handler
End Select

In the RSS 1.0 handler, we detect the possible versions of 1.0, 0.90 and 1.1 by processing the xmlns:rdf and xmlns attribute. Note the RDF namespace must be "http://www.w3.org/1999/02/22-rdf-syntax-ns#" for a valid RSS 1.0 type feed:



m_type = RSSType.RSSType_RDF ' Set the member variable to the RDF type of feed
' Check xmlns attribute for RSS version
If xmlns_rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#" Then
    m_version = Switch(xmlns = "http://purl.org/rss/1.0/", "1.0", _
                       xmlns = "http://channel.netscape.com/rdf/simple/0.9/", "0.90", _
                       xmlns = "http://purl.org/net/rss1.1#", "1.1")
    ' RDF/RSS 1.0 spec http://web.resource.org/rss/1.0/spec#s5.2
    ' RSS 0.90 spec http://www.rssboard.org/rss-0-9-0
    ' RSS 1.1 spec http://inamidst.com/rss1.1/
    m_channel.Title = m_XMLDocument.DocumentElement.GetElementsByTagName("title").Item(0).InnerText
    m_channel.Link = m_XMLDocument.DocumentElement.GetElementsByTagName("link").Item(0).InnerText
    m_channel.Description = m_XMLDocument.DocumentElement.GetElementsByTagName("description").Item(0).InnerText
    DetectFeedType = Not IsNothing(m_version) ' make sure we had a valid XML Namespace
Else
    DetectFeedType = False
End If

RSS 2.x and 0.91-0.94 are easier: just read the version attribute.



m_type = RSSType.RSSType_RSS_2_x ' Set the member variable for the RSS 2.0 type of feed
' GetAttribute returns an empty string when the attribute is missing,
' so test for that rather than comparing against vbNull
If Not String.IsNullOrEmpty(rss_version) Then ' If the RSS 2 spec ever changes we can change our code here to detect
    m_version = rss_version
    m_channel.Title = m_XMLDocument.DocumentElement.GetElementsByTagName("title").Item(0).InnerText
    m_channel.Link = m_XMLDocument.DocumentElement.GetElementsByTagName("link").Item(0).InnerText
    m_channel.Description = m_XMLDocument.DocumentElement.GetElementsByTagName("description").Item(0).InnerText
    DetectFeedType = True
End If

In this code, I just check for Atom 1.0. If for some reason you find you need to check for 0.3 and so forth, do so here. I don't bother with 0.3 because it is obsolete, but who knows what you might find out on the web. Note that there can be more than one link, so we need to process the rel attribute to find the one that has the value "alternate" and is a child of the element "feed" (RFC 4287 treats a link with no rel attribute as "alternate" too):



Dim node As XmlNode
Dim rel As XmlNode
m_type = RSSType.RSSType_Atom_1_x
m_version = "1.0"
m_channel.Title = m_XMLDocument.DocumentElement.GetElementsByTagName("title").Item(0).InnerText
' There can be more than one link in an Atom 1.0 header. Need to find the one where rel="alternate"
For Each node In m_XMLDocument.DocumentElement.GetElementsByTagName("link")
    rel = node.Attributes.GetNamedItem("rel")
    ' A link with no rel attribute counts as rel="alternate" (RFC 4287).
    ' OrElse/AndAlso short-circuit, so rel.Value is never read when rel is Nothing.
    If (rel Is Nothing OrElse rel.Value = "alternate") AndAlso node.ParentNode.Name = "feed" Then
        m_channel.Link = node.Attributes.GetNamedItem("href").Value
        Exit For ' Found it, let's bail...
    End If
Next
' subtitle is optional in Atom, so guard against a feed that omits it
Dim subtitles As XmlNodeList = m_XMLDocument.DocumentElement.GetElementsByTagName("subtitle")
If subtitles.Count > 0 Then m_channel.Description = subtitles.Item(0).InnerText
DetectFeedType = True
' TODO: Find Atom Version here. As code stands now assumes 1.0
' Atom 1.0 spec http://tools.ietf.org/html/rfc4287#section-1.1

Now we set our version string for our object and we are done:



If DetectFeedType Then
    m_version = Choose(m_type, "RDF", "RSS", "ATOM") & " " & m_version
    Title = m_channel.Title
    Link = m_channel.Link
    Description = m_channel.Description
End If
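
For readers who want to try the detection idea outside the CRSSFeed class, here is a minimal stand-alone sketch. It is not the class code from this article: the module name DetectSketch and the function DescribeFeed are made up for illustration, it parses an inline sample string instead of a live feed, and it only reports the root element and version rather than filling in a channel.

```vbnet
Imports System
Imports System.Xml

Module DetectSketch
    ' Hypothetical stand-alone helper; the article's real logic lives inside CRSSFeed.
    Function DescribeFeed(ByVal doc As XmlDocument) As String
        Dim root As XmlElement = doc.DocumentElement
        Select Case root.Name.ToLowerInvariant()
            Case "rdf:rdf"
                ' The default namespace tells the RDF-based versions apart
                Return "RDF (namespace " & root.GetAttribute("xmlns") & ")"
            Case "rss"
                ' GetAttribute returns "" when the attribute is absent
                Dim version As String = root.GetAttribute("version")
                If String.IsNullOrEmpty(version) Then Return "RSS (no version attribute)"
                Return "RSS " & version
            Case "feed"
                Return "ATOM"
            Case Else
                Return "Unknown"
        End Select
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        ' Inline sample so the sketch runs without network access
        doc.LoadXml("<rss version=""2.0""><channel><title>Example</title></channel></rss>")
        Console.WriteLine(DescribeFeed(doc)) ' Prints: RSS 2.0
    End Sub
End Module
```

Swapping the LoadXml call for doc.Load with a feed URL should exercise the same branches against real feeds.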

The next post will be putting the link, title and description in the exposed properties of our object. Now that we have detected the feed type, processing the rest of the feed is a snap. Until next time, Happy Computing!!!


To view the full code, click here.

Friday, April 23, 2010

Syndication Formats Demystified (Kind Of)

CNET described the motivation of Atom's creators as follows: "Winer's opponents are seeking a new format that would clarify RSS ambiguities, consolidate its multiple versions, expand its capabilities, and fall under the auspices of a traditional standards organization." (Wiki)

This statement led me to develop the following Equation:

$C_1 = C_0 e^{n}$

Where:

  • $C_0$ is the initial complexity of the problem
  • $C_1$ is the final complexity of the problem
  • $n$ is the number of engineers simplifying the problem
  • $e$ is the exponential constant

This would seem to give us the nice e-curve that describes the situation.

To decode RSS files we must first understand the formats and discover how each is unique. There are 8 formats of RSS divided into three distinct categories as follows. The name of each format links back to its specification page for future reference.

The RDF (or RSS 1.*) branch includes the following versions:

  • RSS 0.90 was the original Netscape RSS version. This RSS was called RDF Site Summary, but was based on an early working draft of the RDF standard, and was not compatible with the final RDF Recommendation.
  • RSS 1.0 is an open format by the RSS-DEV Working Group, again standing for RDF Site Summary. RSS 1.0 is an RDF format like RSS 0.90, but not fully compatible with it, since 1.0 is based on the final RDF 1.0 Recommendation.
  • RSS 1.1 is also an open format and is intended to update and replace RSS 1.0. The specification is an independent draft not supported or endorsed in any way by the RSS-Dev Working Group or any other organization.

The RSS 2.* branch (initially UserLand, now Harvard) includes the following versions:

  • RSS 0.91 is the simplified RSS version released by Netscape, and also the version number of the simplified version originally championed by Dave Winer from Userland Software. The Netscape version was now called Rich Site Summary; this was no longer an RDF format, but was relatively easy to use.
  • RSS 0.92 through 0.94 are expansions of the RSS 0.91 format, which are mostly compatible with each other and with Winer's version of RSS 0.91, but are not compatible with RSS 0.90.
  • RSS 2.0.1 has the internal version number 2.0. RSS 2.0.1 was proclaimed to be "frozen", but still updated shortly after release without changing the version number. RSS now stood for Really Simple Syndication. The major change in this version is an explicit extension mechanism using XML namespaces.

From Wikipedia: http://en.wikipedia.org/wiki/RSS#Variants

Atom 1.0 

From Wikipedia: http://en.wikipedia.org/wiki/Atom_(standard)#Initial_work

My next post will show and explain the code I used to detect what feed format I'm working with and how to use that information to add the Title, Link, and Description to the CRSSItems collection. Needless to say, the detection code is longer than the decoding code.  

Monday, April 19, 2010

A Class for RSS Data

The first goal of decoding the various forms of RSS files is to create a base class to encapsulate the functionality and data of the RSS file.  The basic structure of an RSS file is a channel with a title, link, and description, plus several items that each have a title, link, and description.  This should make the initial structure of the base class obvious.  It is also worth noting that other data besides these three fields can be present in the various formats.  For instance, the channel may contain a webmaster email address or other such items.  Such information is mostly metadata by definition.  Any instance of the class should be able to deal with this data if we so choose.  With this in mind, I created the following CRSSItem class in the Class Designer:


I decided to handle metadata by creating a Collection object to store the data as Name Value pairs and a Flag to tell the class whether that data should be captured or ignored.

Private m_col_metadata As New Collection ' Instantiated here so the methods below never see Nothing
Private m_hasmetadata As Boolean = False

The MetaInfo and SetMetaInfo Methods expose the m_col_metadata Collection to the calling application.  The MetaInfo method takes a key representing the name of metadata and returns the corresponding value from the Collection.

Public Function MetaInfo(ByVal Key As String) As String
       MetaInfo = m_col_metadata(Key)
End Function

The SetMetaInfo method simply adds or replaces the metadata value for a given key.

Public Sub SetMetaInfo(ByVal key As String, ByVal data As String)
     If m_col_metadata.Contains(key) Then
            m_col_metadata.Remove(key)
     End If
     m_col_metadata.Add(data, key)

End Sub
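
To see the add-or-replace behaviour in isolation, here is a small sketch. MetaStore is a made-up, cut-down stand-in for CRSSItem that holds only the metadata collection, and the guard that returns Nothing for a missing key is my own addition rather than something the class above does.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module MetaInfoSketch
    ' Hypothetical cut-down stand-in for CRSSItem: metadata store only.
    Class MetaStore
        Private m_col_metadata As New Collection

        Public Function MetaInfo(ByVal key As String) As String
            ' Added guard (an assumption): return Nothing instead of throwing on a missing key
            If Not m_col_metadata.Contains(key) Then Return Nothing
            Return CStr(m_col_metadata(key))
        End Function

        Public Sub SetMetaInfo(ByVal key As String, ByVal data As String)
            ' Remove any existing entry first so Add never hits a duplicate key
            If m_col_metadata.Contains(key) Then m_col_metadata.Remove(key)
            m_col_metadata.Add(data, key)
        End Sub
    End Class

    Sub Main()
        Dim item As New MetaStore()
        item.SetMetaInfo("webMaster", "old@example.com")
        item.SetMetaInfo("webMaster", "new@example.com") ' Replaces the first value
        Console.WriteLine(item.MetaInfo("webMaster"))    ' Prints: new@example.com
        Console.WriteLine(item.MetaInfo("missing") Is Nothing) ' Prints: True
    End Sub
End Module
```

Note that the VB Collection takes the item first and the key second in Add, which is easy to get backwards.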

Not shown in the diagram are the OnError and Progress events I added to improve communication with the calling app.  The OnError event lets the application handle any error the class cannot recover from on its own.  The Progress event can be used to pass status information back to the app during long-running, blocking calls.  This allows for status bar updates and such.
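
Since the events are not in the diagram, here is a rough sketch of how such events could be declared and consumed. The class name FeedWorker, the event signatures, and the body of Create are all assumptions made for illustration; the real class may well pass different arguments.

```vbnet
Imports System

Module EventsSketch
    ' Hypothetical stand-in for the feed class; the signatures below are assumptions.
    Class FeedWorker
        Public Event OnError(ByVal message As String)
        Public Event Progress(ByVal percent As Integer, ByVal status As String)

        Public Sub Create(ByVal url As String)
            RaiseEvent Progress(0, "Loading " & url)
            If String.IsNullOrEmpty(url) Then
                ' Something the class cannot fix itself, so hand it to the caller
                RaiseEvent OnError("No URL supplied")
                Return
            End If
            ' Real work (download and parse) would go here
            RaiseEvent Progress(100, "Done")
        End Sub
    End Class

    Sub ReportProgress(ByVal percent As Integer, ByVal status As String)
        Console.WriteLine(percent.ToString() & "% " & status)
    End Sub

    Sub ReportError(ByVal message As String)
        Console.WriteLine("Error: " & message)
    End Sub

    Sub Main()
        Dim worker As New FeedWorker()
        ' The calling app subscribes to both events before starting the work
        AddHandler worker.Progress, AddressOf ReportProgress
        AddHandler worker.OnError, AddressOf ReportError
        worker.Create("http://mydomain.com/myfeed.rss")
    End Sub
End Module
```

In a Windows Forms app the Progress handler would typically update a status bar label instead of writing to the console.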

Now that we have created our base class, we can use VB class inheritance to create a class to process the RSS feed and store the data into a collection of RSSItems.



CRSSFeed inherits from CRSSItem using the following statement:

Inherits CRSSItem

I added an Items property to expose the collection of RSS items to the calling application.  The collection of RSS Items is a Generic.Dictionary collection created like this:

Private m_col_items As New System.Collections.Generic.Dictionary(Of Integer, CRSSItem)


Public ReadOnly Property Items() As System.Collections.Generic.Dictionary(Of Integer, CRSSItem)
     Get
           Items = m_col_items
     End Get

End Property

This allows for a call to the Values member of Items for For...Each looping in the calling app.
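
As a quick illustration of that loop, here is a sketch that uses a made-up DemoItem class in place of CRSSItem and fills the dictionary by hand; in the real class the entries would come from the parsed feed.

```vbnet
Imports System
Imports System.Collections.Generic

Module ItemsLoopSketch
    ' Hypothetical minimal stand-in for CRSSItem
    Class DemoItem
        Public Title As String
    End Class

    Sub Main()
        Dim items As New Dictionary(Of Integer, DemoItem)
        items.Add(0, New DemoItem With {.Title = "First post"})
        items.Add(1, New DemoItem With {.Title = "Second post"})

        ' Values exposes just the items, so the caller can ignore the integer keys
        For Each item As DemoItem In items.Values
            Console.WriteLine(item.Title)
        Next
    End Sub
End Module
```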

I also added a Version property to relay what we find out about the version of the RSS file back.  There is also a Create method which takes a URL as an argument.  In upcoming posts I will explain the various formats of RSS files such as RDF, RSS 2.0, and Atom and we will develop our Create Method to put data from these files into the class structure.

Sunday, April 18, 2010

Is RSS Really That 'Simple'

XML provides the underlying framework of Web 2.0 applications. A key application of XML is website syndication, or what was once termed 'push' technology. RSS stands for Really Simple Syndication and theoretically provides the following items in XML format:

  • Title
  • Link
  • Description

In practice, however, this idea has been extended ad infinitum due to the extensibility of RSS and XML.  In essence, there are about 9 ways to encapsulate these three items in an XML file.  This leads to a robust use of the concept without any bullet-proof method of decoding RSS files.  I have read many bulletin board postings made by programmers seeking the best ways of decoding RSS to create a reader application, but have found many of the answers leaving only more questions on the subject.  I believe the best approach to the problem is the Object Oriented solution.

An RSS Reader application in its most basic form would:

  1. Retrieve an RSS file from a website
  2. Display a list of Titles from the file
  3. Display the Description of a given Title
  4. Provide the Link back to the underlying web page

The XmlDocument class available to VB.Net can retrieve an XML file directly from the web, making the first step relatively simple.  Add the following to the General Declarations section of your class:

Imports System.Xml   

Then declare a variable of type XmlDocument as a member of your class:

Private m_XMLDocument As New XmlDocument

To load a file, call the Load method passing the RSS feed URL as an argument:

m_XMLDocument.Load("http://mydomain.com/myfeed.rss") ' Create an XML Document to work with
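
Load throws if the URL cannot be reached or the response is not well-formed XML, so in practice the call is worth wrapping. Here is a minimal sketch under that assumption; TryLoadFeed is a made-up helper name, not part of the class described in this blog, and the URL is the same placeholder used above.

```vbnet
Imports System
Imports System.Xml

Module LoadFeedSketch
    ' Hypothetical helper: returns the loaded document, or Nothing on failure.
    Function TryLoadFeed(ByVal url As String) As XmlDocument
        Dim doc As New XmlDocument()
        Try
            doc.Load(url) ' Fetches and parses in one step
            Return doc
        Catch ex As XmlException
            ' The server answered, but not with well-formed XML
            Console.WriteLine("Not well-formed XML: " & ex.Message)
        Catch ex As Exception
            ' Network errors, bad URLs and so on
            Console.WriteLine("Could not retrieve feed: " & ex.Message)
        End Try
        Return Nothing
    End Function

    Sub Main()
        Dim doc As XmlDocument = TryLoadFeed("http://mydomain.com/myfeed.rss")
        If doc IsNot Nothing Then
            Console.WriteLine("Root element: " & doc.DocumentElement.Name)
        End If
    End Sub
End Module
```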

Now that the RSS feed is loaded into the XmlDocument object, processing is not as straightforward as one would think.  This is because of the various versions of RSS files that can be used.  The problem lends itself to an Object Oriented design approach.  In future posts on this blog I will share the object model I used to solve the problem, explain the various formats, explain how to find each piece of information in each format, and show how to handle extension information as metadata within the classes using dictionaries based on the namespaces.

My hope is that this blog will help programmers make better use of Object Oriented Design, build applications that deal with RSS effectively, and gain a mastery of Web 2.0 that leads us into the creation of Web 3.0 applications.