I had been wondering if this was possible as of lately. After not seeing any examples on the web I decided to tackle the task myself.
I wanted to place some files (doc, xls, ppt, htm, txt) into SQL Servers BLOB Field and then expose them to SharePoint 2007 (MOSS). This would allow me to search the BLOB data through the SharePoint seach functionality. Pretty much, I wanted expose a document stored in SQL Server to enable it for searching on MOSS.
A lot of the time applications will store the file path to the file and then store the actual file on a file server. This isnt something I wanted to do, mainly because I was only going to be storing, maybe, 30-50 documents a month. Each of these documents were maybe 1 to 2 pages long, nothing huge. Therefore I decided to keep it with my SQL data, in the datbaase as a BLOB field. This has many advantages, you can keep certain documents with your custom systems. Therefore, if you decide to move the SQL Database later on down the road, you dont have to copy the file share to another server and possibly update the database records with their new location.
So here's what I needed to do:
1. Create a BLOB field in a database table that could be full text searched.
2. Create the Full-Text Index
3. Insert some test documents
4. Create the query to search within the blob documents. (still in sql at this point).
5. Expose the SQL Table information to MOSS through the Business Data Catalog.
6. Allow the content to be searched from within the MOSS site.
After that, WE'RE DONE!
HOW TO DO THIS
Blob Table
I created the SQL Table with the following SQL Code:
CREATE TABLE [dbo].[BinaryTable](
[Row_Id] [int] IDENTITY(1,1) NOT NULL,
[Blob_Type] [char](3) NOT NULL,
[Attachment] [varbinary](max) NOT NULL,
[File_Name] [varchar](100) NULL,
CONSTRAINT [PK_BinaryTable] PRIMARY KEY CLUSTERED
(
[Row_Id] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
Please note that the Blob_Type is actually the file extension of the document. SQL Server only supports searching the following BLOBS:
- .doc
- .txt
- .xls
- .ppt
- .htm
From BLOBS in Special Operations in MSDN:
Full-text index and search operations
You can index and search certain types of data stored in BLOB columns. When a database designer decides that a table will contain a BLOB column and the column will participate in a full-text index, the designer must create, in the same table, a separate character-based data column that will hold the file extension of the file in the corresponding BLOB field. During the full-text indexing operation, the full-text service looks at the extensions listed in the character-based column, applies the corresponding filter to interpret the binary data, and extracts the textual information needed for indexing and querying.
When a field in a BLOB column contains documents with one of the following file extensions, the full-text search service uses a filter to interpret the binary data and extract the textual information.
• .doc
• .txt
• .xls
• .ppt
• .htm
The extracted text is indexed and becomes available for querying.
Create the Full Text Index
I created a full text catalog.

Then I defined the full text index.

The table was now ready for data.
Inserting Blob Data.
I followed this example to insert a few documents into the system (just alter the connectionstring and column names. I manually altered it by hand an inserted some documents within a few moments. This was for testing only, in a real world app you'd create a robust method for inserting documents into the system.
The documents I inserted all had greeked text for testing purposes.
Querying the Blob.
Once the blob data was inserted, I was then able to query it with this command.
SELECT Row_Id, File_Name FROM BinaryTable WHERE (CONTAINS (Attachment, 'Lorem'))
This searched all the BLOBS for 'Lorem'.
After I got the results I was looking for it was then time to expose it to MOSS.
Exposing the BLOB data to MOSS/SharePoint 2007
In order to expose the data, I had to expose certain metadata entities in the database and certain methods (queries) that allowed SharePoint to understand the search request. This was done through the Business Data Catalog.
This metadata definition had to be built by hand to do enable this functionality. I'm not going to get into how and why things are where they are, all of that info can be found here (expect to spend a good day, or two, to wrap your head around this metadata model).
Here's the code:
BlobSearch.xml
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
- <LobSystem xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://schemas.microsoft.com/office/2006/03/BusinessDataCatalog BDCMetadata.xsd" Type="Database" Version="1.0.0.6" Name="BlobSearchSample" xmlns="http://schemas.microsoft.com/office/2006/03/BusinessDataCatalog">
- <Properties>
<Property Name="WildcardCharacter" Type="System.String">%</Property>
</Properties>
- <LobSystemInstances>
- <LobSystemInstance Name="BlobSearchSampleInstance">
- <Properties>
<Property Name="AuthenticationMode" Type="System.String">WindowsCredentials</Property>
<Property Name="DatabaseAccessProvider" Type="System.String">SqlServer</Property>
<Property Name="RdbConnection Data Source" Type="System.String">TESTSERVER</Property>
<Property Name="RdbConnection Initial Catalog" Type="System.String">BlobTest</Property>
<Property Name="RdbConnection Integrated Security" Type="System.String">SSPI</Property>
<Property Name="RdbConnection Pooling" Type="System.String">false</Property>
<Property Name="SsoApplicationId" Type="System.String">BlobTest</Property>
<Property Name="SsoProviderImplementation" Type="System.String">Microsoft.SharePoint.Portal.SingleSignon.SpsSsoProvider, Microsoft.SharePoint.Portal.SingleSignon, Version=12.0.0.0, Culture=neutral,PublicKeyToken=71e9bce111e9429c</Property>
</Properties>
</LobSystemInstance>
</LobSystemInstances>
- <Entities>
- <Entity EstimatedInstanceCount="10000" Name="File">
- <Properties>
<Property Name="Title" Type="System.String">File_Name</Property>
</Properties>
- <Identifiers>
<Identifier Name="Row_Id" TypeName="System.Int32" />
</Identifiers>
- <Methods>
- <Method Name="GetFiles">
- <Properties>
<Property Name="RdbCommandText" Type="System.String">SELECT Row_Id, File_Name FROM BinaryTable WHERE (CONTAINS (Attachment, @textToSearchFor))</Property>
<Property Name="RdbCommandType" Type="System.Data.CommandType">Text</Property>
</Properties>
- <FilterDescriptors>
- <!-- Define the filters supported by the back-end method (or sql query) here.
-->
- <FilterDescriptor Type="Wildcard" Name="TextToSearchFor">
- <Properties>
<Property Name="UsedForDisambiguation" Type="System.Boolean">true</Property>
</Properties>
</FilterDescriptor>
</FilterDescriptors>
- <Parameters>
- <Parameter Direction="In" Name="@textToSearchFor">
- <TypeDescriptor TypeName="System.String" AssociatedFilter="TextToSearchFor" Name="TextToSearchFor">
- <DefaultValues>
<DefaultValue MethodInstanceName="FileFinderInstance" Type="System.String">%</DefaultValue>
<DefaultValue MethodInstanceName="FileSpecificFinderInstance" Type="System.String">%</DefaultValue>
</DefaultValues>
</TypeDescriptor>
</Parameter>
- <Parameter Direction="Return" Name="Files">
- <TypeDescriptor TypeName="System.Data.IDataReader, System.Data, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" IsCollection="true" Name="FileDataReader">
- <TypeDescriptors>
- <TypeDescriptor TypeName="System.Data.IDataRecord, System.Data, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" Name="FileDataRecord">
- <TypeDescriptors>
<TypeDescriptor TypeName="System.Int32" IdentifierName="Row_Id" Name="Row_Id" />
<TypeDescriptor TypeName="System.String" Name="File_Name" />
</TypeDescriptors>
</TypeDescriptor>
</TypeDescriptors>
</TypeDescriptor>
</Parameter>
</Parameters>
- <MethodInstances>
<MethodInstance Name="FileFinderInstance" Type="Finder" ReturnParameterName="Files" />
<MethodInstance Name="FileSpecificFinderInstance" Type="SpecificFinder" ReturnParameterName="Files" />
</MethodInstances>
</Method>
</Methods>
</Entity>
</Entities>
</LobSystem>
Now that it is exposed I was able to import the Metadata into the BDC on SharePoint, also known as Adding an Application Definition to the Business Data Catalog (click link to learn how to import it into SharePoint)
Searching within SharePoint
Now that the data is defined through the metadata model in SharePoint I can actually search it. I tested this by adding a Business Data List to a page in SharePoint.
I was then able to search for "Lorem" and results came back.

I then searched for another key word "Test" that I put in a single document, and that worked as well. See the screen shots below.

Conclusion
You could expand this functionality by adding it to the Content Sources and Search Scopes in SharePoint to allow for enterprise search too. But I think I'll save that for another day. :) Or just watch the videos provided in the links.
And that folks, is how you expose BLOB data to SharePoint Search.