To help them reach their goal, I worked on some of the files needed to create the tables in the database and the synonym file. The campus file, the simplest of the tables for the database, was already created by the lead developer; I followed the format to create the files for discipline and category.
[Figure: The campus file.]
The synonyms may be inserted into the custom analyzer, as the example below shows, or they may be read from a file. Either way, I was able to create a file that can be used externally or copied and pasted into the analyzer.
[Figure: Synonym format.]
The way this file is set up would contract a search: variant terms are mapped to a single approved term. There is another format that would expand a search, which is helpful if there is no controlled vocabulary or the approved term has changed over time. (In that case, the format would be 'united states => united states, usa' and a search for 'united states' would match either 'united states' or 'usa'.)
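To make the difference between contracting and expanding concrete, here is a minimal Python sketch. It is not the analyzer itself; it only mimics how an explicit 'a, b => c' mapping could rewrite a whole search term. The function names `parse_rules` and `apply_rules` are made up for illustration, and the contracting rule is my assumption about what such a line might look like, not a copy of the project's actual synonym file.

```python
def parse_rules(lines):
    """Parse explicit mappings of the form 'a, b => c, d' into a dict.

    Each term on the left side maps to the full list of terms on the right.
    """
    rules = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        left, right = line.split("=>")
        targets = [t.strip() for t in right.split(",") if t.strip()]
        for source in left.split(","):
            source = source.strip()
            if source:
                rules[source] = targets
    return rules


def apply_rules(term, rules):
    """Return the terms a search term is rewritten to (itself if no rule matches)."""
    return rules.get(term, [term])


# Assumed contracting rule: several variants collapse to one approved term.
contracting = parse_rules(["usa, united states => united states"])

# Expanding rule from the text: one term fans out to several.
expanding = parse_rules(["united states => united states, usa"])

print(apply_rules("usa", contracting))
print(apply_rules("united states", expanding))
```

With the contracting rule, every variant ends up as the single approved term, so the index stays consistent with a controlled vocabulary. With the expanding rule, one search term becomes several, which catches older or alternate forms.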
It has been very useful working on these files. I had read the documentation, but working with the data and values helps clarify what it all means. I have a much better understanding of what is happening with the analyzers from working on this.
I'm really excited to see the project coming together. There should be at least a prototype available to show at the end of the semester when I have to give my presentation.