Terms and frequencies?  
E Robinson





PostPosted: Windows Desktop Search Development, Terms and frequencies? Top

Hi,

After running a query and getting the results back, is it possible to get the terms and their frequencies for each document?

Thanks,
Eric


 
 
globelin






You can use a hashtable to count the word frequencies for each of your documents.



 
 
E Robinson






So, is there a column that contains the indexed terms for the documents?


 
 
globelin






I would use the word-docNo pair as the hashtable key and count the term frequency (tf) for each pair.

It sounds like you are doing some information retrieval (IR) work.
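
For illustration, here is a minimal Python sketch of the hashtable idea described above. It assumes you already have each document's text in memory; the function name `term_frequencies`, the sample `docs` mapping, and the simple lowercase alphanumeric tokenizer are all hypothetical choices for this example and are not part of Windows Desktop Search.

```python
import re
from collections import Counter


def term_frequencies(docs):
    """Count term frequencies keyed by (term, doc_no).

    docs is a mapping of document number -> document text.
    """
    tf = Counter()
    for doc_no, text in docs.items():
        # Naive tokenizer (an assumption): lowercase runs of letters and digits.
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            tf[(term, doc_no)] += 1
    return tf


# Hypothetical sample documents, used only to show the shape of the result.
docs = {
    1: "the quick brown fox",
    2: "The fox and the hound",
}
tf = term_frequencies(docs)
```

Looking up `tf[("the", 2)]` gives the count of "the" in document 2, and a pair that never occurred simply returns 0. A real indexer's word breaker will split text differently, so these counts would not necessarily match what the index stores.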



 
 
E Robinson






See the post at: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=516694&SiteID=1

It describes all of the columns that are defined in WDS 2.6.5. Note that there is a column called "contents", but it is not retrievable.

I see no definition for any column that contains the list of indexed terms for a document. What I am asking is: "Is there a way to retrieve the list of indexed terms for the documents that are returned as part of a query?"

I don't want to have to go open each file and essentially duplicate the work that the indexing engine has already accomplished.

Eric

 
 
Paul Nystrom - MSFT






Hello E Robinson,

There currently isn't a way to retrieve this information. You can query on the column Characterization (System.Search.AutoSummary in 3.0), but that will only give you the terms from the beginning of the document.

If you want access to all of the content you can instantiate and invoke the IFilter to grab the content stream again.

Paul Nystrom - MSFT
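
As a rough illustration of the summary-column suggestion above, the following Python sketch only builds the text of a query that asks for System.Search.AutoSummary. It assumes the WDS 3.0 SQL syntax with the SYSTEMINDEX catalog and the System.ItemPathDisplay property; the helper name `build_summary_query` is hypothetical, and actually executing the query (for example through the search OLE DB provider) is left out.

```python
def build_summary_query(term: str) -> str:
    """Build a query string that returns each match's path and auto summary.

    Assumption: WDS 3.0 SQL syntax (SYSTEMINDEX catalog, CONTAINS predicate).
    """
    # Drop double quotes and double up single quotes so the term stays
    # inside the quoted CONTAINS phrase.
    safe = term.replace('"', "").replace("'", "''")
    return (
        "SELECT System.ItemPathDisplay, System.Search.AutoSummary "
        "FROM SYSTEMINDEX "
        f"WHERE CONTAINS('\"{safe}\"')"
    )


query = build_summary_query("frequency")
```

As noted in the reply, the summary only covers the beginning of each document, so term counts taken from it would be partial at best.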