Crawling a database from a collection is a very common task, and IBM Watson Explorer makes it easy. In this example, I'll create a collection and run a simple query against an IBM DB2 database. The steps are very similar for other databases; just keep in mind that you will need the correct driver.
1- Put the driver in place:
Get the JDBC driver for your database and put it in the correct folder, usually something like /opt/IBM/dataexplorer/WEX-11_0_2/Engine/lib/java/database/.
2- Create the collection, copying defaults from default:
3- Add a new seed. This is where your collection will get its data from:
4- Choose Database:
5- Enter your database settings and the query that will be performed:
6- It's done, now you can test:
7- This can take a while depending on your query and connection, but when it finishes, it will show some of the rows that the query returned in the following format. To see the row data, click Crawler XML:
8- Here is your data:
9- Now that we can see it's working, you can start your crawl. This step feeds your collection and can take a long time depending on the amount of data:
10- You should see crawl activity:
11- You can now query your collection to test it. Just enter your term and click search in the options on the left:
12- You will see something like this:
That's it, you have created a collection that gets its data from a database!
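If the test in step 6 fails, a common cause is that the driver from step 1 is not where the crawler expects it. The following is a minimal sketch, not part of Watson Explorer, that lists the .jar files in the driver folder so you can confirm the file is there. The helper name find_driver_jars is made up for this example, and the path is the one from step 1, which may differ on your installation.

```python
from pathlib import Path

# Folder from step 1; adjust it to match your own installation.
DRIVER_DIR = Path("/opt/IBM/dataexplorer/WEX-11_0_2/Engine/lib/java/database")


def find_driver_jars(driver_dir):
    """Return the sorted names of the .jar files found directly in driver_dir.

    Hypothetical helper for this post: returns an empty list when the
    folder does not exist or contains no .jar files.
    """
    directory = Path(driver_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.jar"))


if __name__ == "__main__":
    jars = find_driver_jars(DRIVER_DIR)
    if jars:
        print("Driver jars found:", ", ".join(jars))
    else:
        print("No .jar files found in", DRIVER_DIR)
```

For DB2 you would expect to see the DB2 JDBC driver jar (for example db2jcc4.jar) in the list.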
Sometimes we need to enable wildcard searches, such as *, for a collection in Watson Explorer. This can certainly make our queries consume more CPU and memory. Compare a query that performs a “select … where field = ‘XXX’” with one that performs a “select … where field like ‘*XXX’” (pseudocode). Which will be faster? So think carefully before enabling this!
To enable it, go to your collection configuration -> Indexing -> Term expansion support (4), and check Generate Dictionaries.
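The cost comparison above can be made concrete with a small sketch. It is only an analogy: it uses an in-memory SQLite database from Python's standard library, not Watson Explorer or DB2, and the table and index names are invented for the example. SQL uses % as its wildcard rather than *. The script asks SQLite how it plans to run each query: the exact match can use the index, while the leading-wildcard match has to scan every entry.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Hypothetical table and index, created only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, field TEXT)")
conn.execute("CREATE INDEX idx_docs_field ON docs (field)")
conn.executemany(
    "INSERT INTO docs (field) VALUES (?)",
    [(f"term{i}",) for i in range(1000)],
)

# Exact match: SQLite can look the value up directly in the index.
exact_plan = query_plan(conn, "SELECT id FROM docs WHERE field = ?", ("term42",))

# Leading wildcard: the index ordering does not help, so every entry is visited.
wildcard_plan = query_plan(conn, "SELECT id FROM docs WHERE field LIKE ?", ("%42",))

print("exact match:   ", exact_plan)
print("wildcard match:", wildcard_plan)
```

The first plan starts with SEARCH (an index lookup) and the second with SCAN (a full pass over the entries), which is the extra work to keep in mind before turning wildcards on.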