
Friday, August 31, 2007

TeamSite Content Storage Repository Strategies

After your content is created in TeamSite, the next question is where to deploy it. What alternatives do you have for a content repository, and what are their advantages and disadvantages?

The following subsections describe the mainstream alternatives for a content repository, including the advantages and disadvantages of each.

Each approach is discussed against the same requirement: a portlet that shows a list of articles on some subject, where the user can click or otherwise select an article to view its full contents.

Approach: File system only

A content repository could consist solely of a file system. TeamSite renders each content piece as an HTML fragment and deploys the fragment to a file system accessible by WebSphere Portal. WebSphere Portal publishes whatever is available on the file system: a portlet simply reads the file and writes it to the response object for display to the user.
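As a rough illustration of how little the portlet has to do in this approach, the following sketch copies one pre-rendered fragment to a writer. The class name, method name, and the one-file-per-article naming scheme (`<articleId>.html`) are assumptions made for the example; they are not part of TeamSite or WebSphere Portal. In a real JSR 168 portlet the writer would presumably come from `RenderResponse.getWriter()`.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: names and directory layout are illustrative assumptions.
class FragmentReader {

    // Copies one pre-rendered HTML fragment from the deployment directory
    // to the given writer without modifying it.
    static void writeFragment(Path deployDir, String articleId, Writer out) throws IOException {
        // Normalize the path so an id such as "../secret" cannot leave deployDir.
        Path root = deployDir.normalize();
        Path file = root.resolve(articleId + ".html").normalize();
        if (!file.startsWith(root)) {
            throw new IOException("Article id escapes the deployment directory: " + articleId);
        }
        out.write(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

Note that the portlet passes the markup through untouched, which is exactly why it cannot influence presentation or links in this approach.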

Presentation

Because the portlet simply reads the file and responds with its contents, the presentation is controlled by TeamSite. This means that WebSphere Portal cannot affect the presentation of individual articles. Creating an index page, whether of all articles in a directory or of all articles in total, requires scanning the file system for available files.

Personalization

Because only the rendered content pieces are deployed, there is no data that the portlet can use for any kind of personalization. Individual articles are impossible to personalize because they are already rendered for presentation. Personalization of article listings (for example, indexes of articles) requires each file to be visited and parsed for possible metadata values.

Advantages

1. Full use of TeamSite data capture capabilities
2. Full use of TeamSite presentation template capabilities
3. Only one presentation template is needed. The same one can be reused for both preview and publishing.
4. The portlet needed to display individual content pieces is simple to implement and requires little runtime processing
5. Simpler deployment from TeamSite, using only OpenDeploy

Disadvantages

1. Linking between individual content pieces is virtually impossible. Because all presentation markup, including links, is generated by TeamSite, there is no way to invoke portlet actions or other WebSphere Portal commands from the HTML fragment. Therefore, no standard or scalable method exists for switching the content piece shown in the portlet.
2. Because of the linking problem described above, the index page needs to be created by the portlet. To find all available articles, the portlet needs to scan the file system, which requires a lot of I/O.
3. No content filtering can be used. For example, letting the user select a topic, or presenting only articles that match certain user attributes, is impossible or very impractical.
For example, we might want a portlet that shows all articles relevant to the city the user lives in. To determine whether an article is suitable, the portlet would have to scan each file to find the metadata on which to filter.
4. No access control for content pieces is available. The lack of metadata means that the portlet has nothing it can use to determine whether a certain user is allowed to see an article.
Again, imagine that the portlet doing the listing should show some articles only if the user is a member of the gold customer group. The metadata that determines which group an article is suitable for is available only inside the file, which requires the same solution as for the previous point.
5. No presentation personalization can be used. Because the presentation of individual articles is controlled by TeamSite, it is not possible (or at least not practical) to present the content based on user attributes.
For example, you might want only certain user groups to post and read comments about a specific article. When TeamSite generates the markup, there is no mechanism to hide the feedback form from the non-authorized groups.
6. You cannot use any WebSphere Portal functionality, such as adding person awareness within the articles.
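To make the cost behind disadvantages 2 through 4 concrete, here is a minimal sketch of the file scan the portlet would need for the "articles for the user's city" listing. It assumes, purely for illustration, that each fragment embeds its metadata as a tag of the form `<meta name="city" content="...">`; the class name and that convention are hypothetical and not something TeamSite prescribes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scanner: the meta tag convention below is an assumption.
class FragmentScanner {

    private static final Pattern CITY =
            Pattern.compile("<meta\\s+name=\"city\"\\s+content=\"([^\"]*)\"");

    // Returns the file names of all fragments tagged with the given city.
    // Every file is opened and read in full on every call.
    static List<String> articlesForCity(Path deployDir, String city) throws IOException {
        List<String> matches = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(deployDir, "*.html")) {
            for (Path file : files) {
                String html = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
                Matcher m = CITY.matcher(html);
                if (m.find() && m.group(1).equalsIgnoreCase(city)) {
                    matches.add(file.getFileName().toString());
                }
            }
        }
        Collections.sort(matches); // directory order is unspecified, so sort for a stable listing
        return matches;
    }
}
```

The work grows with the number and size of deployed files for each listing request, which is the I/O burden the list above refers to.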

File system only approach summary

Using only a file system for the repository greatly limits the deployment. You cannot implement linking between content pieces in a supported way, and you cannot implement personalization. Furthermore, you cannot have access control at the content piece level. This solution is recommended only for a small proof of concept that shows the functionality of, and data exchange between, TeamSite and WebSphere Portal.

Approach: Combination of file system and database

In this scenario the content repository consists of two parts: a file system, where individual articles are deployed as fragments, and a database, where metadata for each piece of content is deployed.

TeamSite generates JSP files and deploys them to the file system. This differs from the previous scenario, where HTML fragments are deployed. Each piece of content also has a set of metadata that TeamSite deploys to a database.

On the WebSphere Portal side, you use a portlet that reads the relevant user data and compares it to the metadata. Depending on the defined rules, the portlet presents content customized to the user's profile.

For example, the country of the user, such as Sweden, is stored in the user object and is, therefore, available to the portlet. The portlet queries all content metadata and displays only that content whose country metadata element indicates Sweden.
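The country example can be sketched as a simple filter over metadata rows. In the sketch below an in-memory list stands in for the result of the database query, and the `ArticleMeta` fields (`jspPath`, `title`, `country`) are assumed names chosen for illustration rather than an actual TeamSite or DataDeploy schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical metadata row; the field names are assumptions, not a real schema.
class ArticleMeta {
    final String jspPath;
    final String title;
    final String country;

    ArticleMeta(String jspPath, String title, String country) {
        this.jspPath = jspPath;
        this.title = title;
        this.country = country;
    }
}

class CountryFilter {

    // Returns the rows whose country matches the user's country, in input order.
    static List<ArticleMeta> forCountry(List<ArticleMeta> rows, String userCountry) {
        List<ArticleMeta> result = new ArrayList<>();
        for (ArticleMeta row : rows) {
            if (row.country != null && row.country.equalsIgnoreCase(userCountry)) {
                result.add(row);
            }
        }
        return result;
    }
}
```

In practice the comparison would more likely be pushed into the query itself (a `WHERE` clause on the country column), so that only matching rows leave the database; the portlet then includes the JSP named by each returned row.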

Advantages

1. Content contributors can use TeamSite templating and the TeamSite graphical user interface (GUI) to enter and preview content.
2. The different types of links (see previous section) become possible. The presentation template used by TeamSite must include the relevant JSP custom tags which, when executed in WebSphere Portal, create the right dynamic links.
3. Content filtering is possible. Because metadata is kept in the database, the portlet can query it for content which matches the user's attributes or selection.
4. Access control is possible. As in the previous point, with metadata available the portlet can determine whether an article should be displayed for a user.
5. No performance-intensive database calls are needed to display content pieces.
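The access control advantage can be illustrated with the gold customer example used earlier. This is only a sketch under the assumption that each article's metadata carries at most one required group name and that the portlet already knows the user's group memberships; the class and method names are hypothetical.

```java
import java.util.Set;

// Hypothetical check; a single "required group" per article is an assumption.
class AccessCheck {

    // An article without a required group is treated as public;
    // otherwise the user must be a member of that group.
    static boolean mayView(String requiredGroup, Set<String> userGroups) {
        if (requiredGroup == null || requiredGroup.isEmpty()) {
            return true;
        }
        return userGroups.contains(requiredGroup);
    }
}
```

Because the decision uses only the metadata row, the portlet can drop an article from a listing without ever opening the deployed file.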

Disadvantages

1. Deployment from TeamSite becomes more complicated. You deploy JSP files to the file system, using OpenDeploy, and metadata to the database, using DataDeploy. The deployment scripts need to be written so that metadata is written to the database only if the corresponding file exists on the file system.
2. You need two presentation templates: one for previewing in TeamSite (using HTML), and one for generating the JSP file, used by WebSphere Portal.
3. The JSP files need to be deployed to a directory inside the expanded portlet WAR file on the WebSphere Portal Server machine; otherwise the portlet cannot pick up these files to display them. Typically this is /installedApps/.ear/.war under the directory of the WebSphere Application Server, where the .ear and .war names are the name of the portlet application combined with an ID created at installation time.
4. No presentation personalization can be used because the presentation is controlled by TeamSite, and it is not possible (or at least not practical) to present the content based on user attributes.
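The consistency rule in disadvantage 1 (write metadata only when the corresponding file exists) can be expressed as a small guard. The sketch below is not OpenDeploy or DataDeploy configuration; it is a hypothetical Java check, with assumed names, that shows the rule a deployment script would have to enforce before inserting metadata rows.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard illustrating the rule; not part of any TeamSite tool.
class DeployGuard {

    // Returns only those JSP names that already exist as regular files
    // under deployDir, preserving the input order.
    static List<String> withDeployedFile(Path deployDir, List<String> jspNames) {
        List<String> safe = new ArrayList<>();
        for (String name : jspNames) {
            if (Files.isRegularFile(deployDir.resolve(name))) {
                safe.add(name);
            }
        }
        return safe;
    }
}
```

Metadata would then be written only for the names this method returns, so the portlet never lists an article whose JSP is missing.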

Combination approach summary

Using a combination of file system and database for your content repository gives you much more flexibility than deploying static HTML files to a file system-only repository. Personalization is possible at the content filtering level. The development becomes more complex because you need a database schema and the portlet responsible for presentation contains more logic. This solution could certainly be used for a production system, especially if performance requirements are high, because the number of database calls for content presentation is low.

Important: All files in /installedApps/.ear/.war are deleted when a portlet application is updated. This is relevant primarily when you are developing; in production, you do not expect the portlet to be updated very often.

Approach: Database only

In this scenario the content repository consists of a database only. TeamSite deploys content as well as metadata to a database, using DataDeploy. WebSphere Portal uses a portlet to pick up the right content and present it to the user in an appropriate way, using the metadata for content filtering and personalization.

Advantages

1. All content is stored in one place.
2. You use only one TeamSite deployment program, DataDeploy, to send content and metadata to the database.
3. Presentation personalization is possible. TeamSite only stores raw data in the database; therefore, no information about presentation is included. Presentation is controlled by the portlet and all available information about the user can influence the presentation.
4. Content filtering is possible. Because the metadata for content is kept in the database, the portlet can query it for content matching the user's attributes.
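Presentation personalization in this approach means the portlet builds the markup itself from raw data. The sketch below revisits the earlier comments example: the feedback form is emitted only for users who are allowed to comment. The class name, the markup, and the `mayComment` flag are illustrative assumptions; a real portlet would more likely do this in a JSP.

```java
// Hypothetical renderer; markup and names are assumptions for illustration.
class ArticleRenderer {

    // Escapes the characters that are significant in HTML text and attributes.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Builds the article markup from raw data; the comment form is included
    // only when the current user is allowed to comment.
    static String render(String title, String body, boolean mayComment) {
        StringBuilder html = new StringBuilder();
        html.append("<h2>").append(escape(title)).append("</h2>");
        html.append("<p>").append(escape(body)).append("</p>");
        if (mayComment) {
            html.append("<form class=\"comments\"></form>");
        }
        return html.toString();
    }
}
```

Because TeamSite stores no markup in this approach, the same raw row can be rendered differently for different users, which neither of the file-based approaches allows.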

Disadvantages

1. Two different techniques to create the GUI are required. The preview functionality in TeamSite requires a presentation template and, for WebSphere Portal Server, a JSP picks up the data from the database and constructs the GUI.
2. A more complex development effort is required. The portlet has more logic associated with it and a proper database schema needs to be constructed.
3. It is more performance intensive, because a database access is required for every data element stored there.

Database only summary

This approach is the most flexible of the three. If presentation personalization is a requirement, you should consider this approach. Even if no personalization requirement exists, consider using this approach anyway, because it makes adding new functionality and making changes easier in the future.

Search is easy to support with this approach because both the data itself and the metadata can be searched. For example, you can find all documents written for the country Sweden.

More development work is required upfront and full-blown personalization comes at the cost of lower throughput (because of the increased number of database calls).
