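Over the past few days I got my first Lucene-based search working, so here are my notes:
First you need the package, the JAR, which you can download from the official website, and then spend some time studying it. Since I am still learning, I downloaded two versions, 1.4.3 and 2.0: lucene-2.0 is kept for later development and lucene-1.4.3 is for learning. The file formats changed a lot in the 2.0 era, including the format of the generated index, so it is best to have both versions. During development you just import the JAR into the project (the code in this post uses the lucene-1.4.3 API). At first I really had no idea how to do that, how embarrassing! I assumed it worked like C++, where you simply include things, and thinking back on it now makes my head spin. Back then I had only just started writing Java and even named a class Cjavaclass, MFC style, which is embarrassing too! I also kept the C habit of naming variables like _javaVar_, embarrassing once more. Things are much better now.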
Step 1:
First write a *.java file that defines the constants
public class Constants {
    public final static String INDEX_FILE_PATH = "C:\\Java\\lucene\\DataSource";
    public final static String INDEX_STORE_PATH = "C:\\Java\\lucene\\DataIndex";
}
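INDEX_FILE_PATH is the folder holding the files to be indexed, and INDEX_STORE_PATH is the folder where the finished index is stored.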
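Both paths are hard-coded, so it can help to confirm the folders exist before indexing starts. Here is a minimal sketch using only the standard library. The class name IndexPaths and its two methods are hypothetical helpers for illustration, not part of Lucene or of the classes in this post.

```java
import java.io.File;

// Hypothetical helper for illustration; not part of Lucene.
class IndexPaths {
    /** Returns true if the path exists and is a directory. */
    static boolean isUsableDirectory(String path) {
        return new File(path).isDirectory();
    }

    /**
     * Makes sure the directory exists, creating it (and any missing
     * parents) if necessary. Returns false if it could not be created,
     * for example because a regular file with that name is in the way.
     */
    static boolean ensureDirectory(String path) {
        File dir = new File(path);
        return dir.isDirectory() || dir.mkdirs();
    }
}
```

One possible use is to call IndexPaths.isUsableDirectory(Constants.INDEX_FILE_PATH) and IndexPaths.ensureDirectory(Constants.INDEX_STORE_PATH) at the start of main, and stop with a clear message if either returns false.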
Step 2:
Write the class that builds the index:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class LuceneIndex {
    // The index writer
    private IndexWriter writer = null;

    // Constructor: open the index folder; "true" means create a new index
    public LuceneIndex() {
        try {
            writer = new IndexWriter(Constants.INDEX_STORE_PATH, new StandardAnalyzer(), true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Turn a file into a Document with two fields: "contents" and "path"
    private Document getDocument(File f) throws Exception {
        Document doc = new Document();
        FileInputStream is = new FileInputStream(f);
        // Note: InputStreamReader uses the platform default charset here
        Reader reader = new BufferedReader(new InputStreamReader(is));
        doc.add(Field.Text("contents", reader));
        doc.add(Field.Keyword("path", f.getAbsolutePath()));
        return doc;
    }

    // Add every file in the source folder to the index
    public void writeToIndex() throws Exception {
        File folder = new File(Constants.INDEX_FILE_PATH);
        if (folder.isDirectory()) {
            String[] files = folder.list();
            System.out.println("Building the index.......... please wait");
            for (int i = 0; i < files.length; i++) {
                File file = new File(folder, files[i]);
                if (!file.isFile()) {
                    continue; // skip subfolders; only regular files can be read
                }
                System.out.println("Indexing file: " + file);
                Document doc = getDocument(file);
                writer.addDocument(doc);
                System.out.println("Done");
            }
        }
    }

    public void close() throws Exception {
        writer.close();
    }

    // Main program for testing
    public static void main(String[] args) throws Exception {
        // Create a LuceneIndex object
        LuceneIndex indexer = new LuceneIndex();
        // Build the index and time it
        Date start = new Date();
        indexer.writeToIndex();
        Date end = new Date();
        System.out.println("Index built.......... Thank you for Lucene");
        System.out.println("");
        System.out.println("Time taken: " + (end.getTime() - start.getTime())
                + " ms");
        indexer.close();
    }
}
The index has now been generated: these are the index files used for full-text search over the text files.
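To get a feel for what a full-text index stores, here is a toy inverted index in plain Java. It is only a conceptual sketch: it is not Lucene's index format, the class name ToyInvertedIndex is made up for illustration, and its tokenizer only handles ASCII words, so unlike the real analyzer it would not work on Chinese text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Conceptual sketch only; this is NOT how Lucene stores its index.
class ToyInvertedIndex {
    // Maps each term to the sorted set of document names that contain it
    private final Map<String, TreeSet<String>> postings = new HashMap<>();

    /** Splits the text into lowercase ASCII words and records the document under each word. */
    void add(String docName, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first token if the text starts with a separator
            }
            postings.computeIfAbsent(token, k -> new TreeSet<String>()).add(docName);
        }
    }

    /** Returns the names of all documents containing the term, in sorted order. */
    List<String> search(String term) {
        TreeSet<String> docs = postings.get(term.toLowerCase());
        if (docs == null) {
            return Collections.<String>emptyList();
        }
        return new ArrayList<String>(docs);
    }
}
```

The point of the structure is that a lookup goes from a term straight to the documents containing it, instead of scanning every file at query time.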
Step 3:
With the groundwork in place, what is still missing is the search class. What is it for? It runs the queries!
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class LuceneSearch {
    // The IndexSearcher that runs queries against the index
    private IndexSearcher searcher = null;
    // The most recently parsed Query
    private Query query = null;

    // Constructor: open the index built by LuceneIndex
    public LuceneSearch() {
        try {
            searcher = new IndexSearcher(IndexReader.open(Constants.INDEX_STORE_PATH));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Search the "contents" field for the keyword; returns null if the search fails
    public final Hits search(String keyword) {
        System.out.println("Searching for keyword: " + keyword);
        try {
            // Use the same analyzer as at indexing time
            query = QueryParser.parse(keyword, "contents",
                    new StandardAnalyzer());
            System.out.println(query);
            Date start = new Date();
            Hits hits = searcher.search(query);
            Date end = new Date();
            System.out.println("Search finished....... took " + (end.getTime() - start.getTime()) + " ms");
            System.out.println(" ");
            return hits;
        } catch (Exception e) {
            e.printStackTrace();
            return null;
        }
    }

    // Print the path of every matching file
    public void printResult(Hits h) {
        if (h == null || h.length() == 0) {
            System.out.println("Sorry, no results were found");
        } else {
            for (int i = 0; i < h.length(); i++) {
                try {
                    Document doc = h.doc(i);
                    System.out.print("Result " + (i + 1) + ", file name: ");
                    System.out.println(doc.get("path"));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
        System.out.println(" ");
        System.out.println("----------------------------------");
        System.out.println(" ");
    }

    public static void main(String[] args) throws Exception {
        LuceneSearch test = new LuceneSearch();
        Hits myHits1 = test.search("足球");   // "football"
        Hits myHits2 = test.search("世界杯"); // "World Cup"
        test.printResult(myHits1);
        test.printResult(myHits2);
    }
}
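Both classes time their work by subtracting two Date objects. As an alternative, here is a small sketch that wraps the same measurement in a reusable helper based on System.nanoTime(). The class name Stopwatch and the method timeMillis are hypothetical names chosen for this example.

```java
// Hypothetical helper for illustration; not part of Lucene.
class Stopwatch {
    /** Runs the task once and returns the elapsed time in whole milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000L; // nanoseconds -> milliseconds
    }
}
```

A call might look like long ms = Stopwatch.timeMillis(() -> test.printResult(myHits1)); for anything that does not need to return a value.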
Step 4:
Run LuceneIndex.java =====> builds the index
Run LuceneSearch.java ====> searches for the keywords
OK, this is my first searcher!
Although it is very simple, it got me started with Lucene search. Thanks, Lucene; thanks, search!
I will keep studying Lucene and search, and artificial intelligence as well!
I love this job!