
1. Building the index

Creating an IndexWriter in Lucene 2.9:

Directory directory = new SimpleFSDirectory(new File(path), new SimpleFSLockFactory()); // create the Directory first
// MaxFieldLength.UNLIMITED removes the per-field length limit (the large field here is content).
// cover == true overwrites an existing index (use it for the initial build); false appends (use it for updates).
// WhitespaceAnalyzer is the analyzer used throughout this example.
IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(), cover, IndexWriter.MaxFieldLength.UNLIMITED);

Tuning IndexWriter parameters

writer.setMergeFactor(50); // how many segments are merged at a time
writer.setMaxMergeDocs(5000); // upper bound on the number of documents in a merged segment

Converting other formats into the Document format Lucene needs

Document doc = new Document(); // each Document is like one database row
doc.add(new Field("uid", line.getUid().toString(), Store.YES, Index.NO)); // each Field is like a database column; uid is stored but not indexed

doc.add(new Field("title", line.getTitle(), Store.NO, Index.ANALYZED));
doc.add(new Field("content", line.getContent(), Store.NO, Index.ANALYZED));

Adding documents to the IndexWriter (any number of documents can be added)

writer.addDocument(doc);
writer.addDocument(doc2);
writer.addDocument(doc3);

Writing the index (the actual write happens on close())

writer.close();
writer = null;

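Reading the number of indexed documents (call these before close(), while the writer is still open)

writer.numDocs(); // number of documents, not counting deleted ones
writer.maxDoc(); // number of documents, including deleted ones that have not been merged away yet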

The index can be optimized before close() (not recommended while the index is still being built)

writer.optimize();

2. Clearing the index
Directory directory = new SimpleFSDirectory(new File(path), new SimpleFSLockFactory());
IndexWriter.unlock(directory); // the key step: forcibly release the directory's write.lock (only safe when no other IndexWriter has the index open)
IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(), false, IndexWriter.MaxFieldLength.LIMITED);
writer.deleteAll(); // delete every document in the index
writer.optimize(); // merge segments, expunging any documents still marked as deleted
writer.close();

3. Deleting specific documents (much the same as clearing)
writer.deleteDocuments(new Term("uri", uri)); // deletes the one or more documents that contain this term
writer.deleteDocuments(query); // deletes every document the query matches

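4. Updating the index
An update is a delete followed by an add; Lucene cannot modify a document in place. IndexWriter.updateDocument(Term, Document) performs both steps in a single call.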
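
To make the delete-then-add idea concrete, here is a minimal plain-Java sketch. It is not Lucene code: ToyIndex and all of its methods are hypothetical names invented for illustration. It only models the bookkeeping described in this post, where a delete marks an entry, an update is a delete plus an add, and a later purge step (the role optimize() plays above) physically drops the marked entries.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical toy model, not part of Lucene.
class ToyIndex {
    private static final class Entry {
        final String key;
        final String body;
        boolean deleted;

        Entry(String key, String body) {
            this.key = key;
            this.body = body;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String key, String body) {
        entries.add(new Entry(key, body));
    }

    // Marks every live entry with this key as deleted; nothing is removed yet.
    void deleteByKey(String key) {
        for (Entry e : entries) {
            if (!e.deleted && e.key.equals(key)) {
                e.deleted = true;
            }
        }
    }

    // An update is simply a delete followed by an add.
    void update(String key, String body) {
        deleteByKey(key);
        add(key, body);
    }

    // Physically drops the entries that were marked as deleted.
    void purge() {
        for (Iterator<Entry> it = entries.iterator(); it.hasNext();) {
            if (it.next().deleted) {
                it.remove();
            }
        }
    }

    // Live entries only.
    int numDocs() {
        int live = 0;
        for (Entry e : entries) {
            if (!e.deleted) {
                live++;
            }
        }
        return live;
    }

    // All entries, including those marked as deleted.
    int maxDoc() {
        return entries.size();
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("a", "v1");
        index.add("b", "v1");
        index.update("a", "v2");
        System.out.println(index.numDocs() + " live, " + index.maxDoc() + " total");
        index.purge();
        System.out.println(index.numDocs() + " live, " + index.maxDoc() + " total");
    }
}
```

In this toy model the gap between maxDoc() and numDocs() is the number of entries waiting to be purged, which mirrors the difference between the two IndexWriter counters mentioned earlier.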

5. Reading the indexed terms
TermEnum terms = indexReader.terms(new Term(index, "")); // enumeration positioned at the first term at or after (index, "")
Term term = terms.term(); // the current term
term.field(); // the term's field name
term.text(); // the term's text
// terms.next() advances to the following term and returns false once the enumeration is exhausted

6. Searching
Creating an IndexSearcher in Lucene 2.9:

Directory directory = new SimpleFSDirectory(new File(path), new SimpleFSLockFactory());
IndexSearcher indexSearcher = new IndexSearcher(directory, true); // true opens the index read-only

Building the query (a deliberately complex one: several required conditions, one of which searches across multiple fields, combining exact and range matches)

BooleanQuery bQuery = new BooleanQuery();
Query query1 = null, query2 = null, query3 = null;
BooleanClause.Occur[] flags = new BooleanClause.Occur[] { BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD };
query1 = MultiFieldQueryParser.parse(params.get("keywords"), new String[] { "title", "content" }, flags, new WhitespaceAnalyzer());
bQuery.add(query1, Occur.MUST); // query1 matches the keywords against both title and content
query2 = new TermQuery(new Term("startgui", params.get("startgui")));
bQuery.add(query2, Occur.MUST); // query2 is an exact match
Long minPriceLong = Long.parseLong(params.get("minPrice"));
Long maxPriceLong = Long.parseLong(params.get("maxPrice"));
// price must have been indexed as a NumericField for the range query to match
query3 = NumericRangeQuery.newLongRange("price", minPriceLong, maxPriceLong, true, true);
bQuery.add(query3, Occur.MUST); // query3 is a range match, inclusive at both ends

Sorting

SortField[] sortField = new SortField[] { SortField.FIELD_SCORE, new SortField(null, SortField.DOC, true) }; // default order: by score, then by document number reversed
SortField sortPriceField = new SortField("sortPrice", SortField.LONG, sortPrice); // sortPrice == true reverses the order
sortField = new SortField[] { sortPriceField, SortField.FIELD_SCORE, new SortField(null, SortField.DOC, true) }; // sort by the custom price field first
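
The multi-key ordering above can be pictured with an ordinary Comparator chain. The sketch below is plain Java, not Lucene: SortSketch, Hit and sortedDocs are hypothetical names, and it assumes that the score key ranks higher scores first and that the reversed document key ranks higher document numbers first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not part of Lucene.
class SortSketch {
    static final class Hit {
        final int doc;
        final float score;
        final long price;

        Hit(int doc, float score, long price) {
            this.doc = doc;
            this.score = score;
            this.price = price;
        }
    }

    // Price first (reversed on request), then score descending, then doc number descending.
    static Comparator<Hit> order(boolean reversePrice) {
        Comparator<Hit> byPrice = Comparator.comparingLong((Hit h) -> h.price);
        if (reversePrice) {
            byPrice = byPrice.reversed();
        }
        Comparator<Hit> byScoreDesc = (a, b) -> Float.compare(b.score, a.score);
        Comparator<Hit> byDocDesc = (a, b) -> Integer.compare(b.doc, a.doc);
        return byPrice.thenComparing(byScoreDesc).thenComparing(byDocDesc);
    }

    // Returns the doc numbers in sorted order without modifying the input list.
    static List<Integer> sortedDocs(List<Hit> hits, boolean reversePrice) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(order(reversePrice));
        List<Integer> ids = new ArrayList<>();
        for (Hit h : copy) {
            ids.add(h.doc);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit(0, 1.0f, 300L),
                new Hit(1, 2.0f, 100L),
                new Hit(2, 2.0f, 100L),
                new Hit(3, 0.5f, 100L));
        System.out.println(sortedDocs(hits, false)); // prints [2, 1, 3, 0]
        System.out.println(sortedDocs(hits, true));  // prints [0, 2, 1, 3]
    }
}
```

Later keys only matter when the earlier ones tie: documents 1 and 2 share price and score, so the document-number key decides between them.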

Running the search in Lucene 2.9 (this step only retrieves document ids)

TopFieldDocs docs = indexSearcher.search(query, null, indexSearcher.maxDoc(), new Sort(sortField)); // query is the Query to run, e.g. the bQuery built above
ScoreDoc[] scoreDocs = docs.scoreDocs;
docCount = scoreDocs.length;

Adding pagination

List<Document> docList = new ArrayList<Document>();
int max = Math.min(startIndex + pageSize, docCount); // clamp to docCount to avoid ArrayIndexOutOfBoundsException
for (int i = startIndex; i < max; i++) {
    ScoreDoc scoredoc = scoreDocs[i];
    Document doc = indexSearcher.doc(scoredoc.doc); // the 2.9 way to load a document by its id
    docList.add(doc);
}
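
The bounds handling in the pagination loop can be checked in isolation. The following is a minimal plain-Java sketch under the assumption that the hits are already in a list; PageSlice and page are hypothetical names, and the list of integers stands in for the scoreDocs array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene.
class PageSlice {
    // Returns the items of one page, clamping both ends to the list bounds.
    static <T> List<T> page(List<T> hits, int startIndex, int pageSize) {
        int from = Math.max(0, Math.min(startIndex, hits.size()));
        // Widen to long so a very large pageSize cannot overflow the addition.
        int to = (int) Math.min((long) from + Math.max(0, pageSize), (long) hits.size());
        return new ArrayList<>(hits.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> hits = Arrays.asList(10, 11, 12, 13, 14);
        System.out.println(page(hits, 0, 2)); // prints [10, 11]
        System.out.println(page(hits, 4, 2)); // prints [14]
        System.out.println(page(hits, 9, 2)); // prints []
    }
}
```

A start index past the end yields an empty page rather than an exception, which matches the behavior of the loop above when startIndex is greater than or equal to docCount.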

Loop over docList and read the values you need from each Document

doc.get("title"); // returns null unless the field was stored with Store.YES

...

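7. About analyzers
The analyzer used when building the index and the one used when searching must be the same, and both steps must point at the same index directory.

Some analyzers that ship with Lucene:

StandardAnalyzer() splits on whitespace and punctuation

WhitespaceAnalyzer() splits on whitespace only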
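
The difference between the two splitting rules is easy to see on a sample sentence. The sketch below is plain Java with regular expressions and only approximates the descriptions above: TokenizeSketch and its methods are hypothetical names, and the real StandardAnalyzer does more than split (it also lowercases tokens and removes stop words, for example).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical approximation, not Lucene's analyzers.
class TokenizeSketch {
    // Splits on the given regex and drops empty pieces.
    private static List<String> split(String text, String regex) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split(regex)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Whitespace-only splitting: punctuation stays attached to the token.
    static List<String> onWhitespace(String text) {
        return split(text, "\\s+");
    }

    // Splitting on whitespace and ASCII punctuation.
    static List<String> onWhitespaceAndPunctuation(String text) {
        return split(text, "[\\s\\p{Punct}]+");
    }

    public static void main(String[] args) {
        String text = "Hello, world! How are you?";
        System.out.println(onWhitespace(text));               // prints [Hello,, world!, How, are, you?]
        System.out.println(onWhitespaceAndPunctuation(text)); // prints [Hello, world, How, are, you]
    }
}
```

This is why the index-time and search-time analyzers have to agree: a query token "world" does not equal an indexed token "world!".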

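For Chinese word segmentation this example uses Paoding.

It segments against its dictionary first; text that is not found in the dictionary is split into two-character units (bigrams).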

Posted 2010-07-12 11:49 by 西瓜. Category: Lucene. Post title: Using Lucene 2.9.0
